Abstract

This study is an analysis of the distributed version of database concurrency control. It provides 
concrete mathematical evidence that the distributed problem is an inherently more complex task 
than the centralized one. 

The notions of transaction, concurrency, history, serializability, scheduler, etc., for centralized
databases are now well-understood both from a theoretical and a practical point of view. A formal 
model for the case of distributed databases is presented. The transactions are partially ordered sets 
of actions, as opposed to the totally ordered straight-line programs of the centralized case. The 
scheduler is also a distributed program. Three notions of performance for a scheduler are studied
and interrelated: (i) parallelism, (ii) the computational complexity of the decision problems that it
has to solve, (iii) the cost of communication between the various parts of the scheduler. In fact the
number of messages necessary and sufficient to support a given level of parallelism is equal to the 
length of a combinatorial game. This game, which captures the difference between the centralized 
and the distributed problem, is PSPACE-Complete. This implies that unless NP=PSPACE, a 
scheduler cannot simultaneously minimize the communication cost and be computationally efficient. 

The model presented can also serve as a framework for the study of distributed concurrency 
control by locking. For two transactions an efficient characterization of safe distributed locking 
policies is derived. The new graph-theoretic approach generalizes the geometric method used in the 
centralized case. 

1. Introduction

There is now a considerable literature, both theoretical and applied, concerning
the database concurrency control problem - that is, maintaining the integrity of a 
database in the face of concurrent updates. Most of the theoretical work so far has 
been concerned with the centralized problem, in which the database resides at one 
site, and the update requests are submitted to a single process, called the scheduler, 
which implements the concurrency control policy of the database [7,25,37]. There is 
also some interesting applied work on distributed databases [2,3,4,28,36]. It is often
said that the concurrency control problem is much trickier and harder in the
distributed case than in the centralized case. This is evidenced by the existing
solutions, which are extremely complex and sometimes incorrect.

In this thesis we examine how the complexity of various problems, related to 
concurrency control, is affected when we attempt to solve them for distributed 
databases. The main focus is on two areas, serializability and safe locking policies,
where efficient centralized solutions exist. Our approach and results also add to the
theory of distributed computation, independently of their database context.



1.1 The Main Goals and New Results 

Our main goal is to demonstrate the differences between the centralized and 
distributed versions of natural computational problems. We examine such problems
from the area of database concurrency control, because we also wish to determine the
limits of performance of concurrency control mechanisms.

We investigate two features of distributed computation, which distinguish it 
from centralized computation. First, the uncertainty of the order of events in a 
distributed environment [19]. The order of events is no longer best viewed as total, as 
in the centralized case; instead it is a partial order, whose structure depends on the 
number of sites of the distributed system. So our analysis will highlight differences 
between total and partial orders. The second element is the need for communication
between sites, if the performance of an on-line distributed system is to match that of 
an on-line centralized system. 

In order to find concrete differences we compare the computational complexity
of centralized and distributed tasks. We will use standard concepts from the theory of
computational complexity, (i.e., deterministic polynomial time P, nondeterministic 
polynomial time NP or its complement co-NP and polynomial space PSPACE, 
[1,11,33,34]), as well as notions from the theory of combinatorial games [5,8,29]. The 
contributions of this thesis are summarized in the next three sections. 



(I) The Model 

We have developed a simple mathematical model of distributed databases, 
which captures the intricacies of distributed computation that are most pertinent to 
the database domain. Some novelties of our model are:

(1) User transactions are arbitrary partial orders of atomic steps, thus 
generalizing the straight-line programs of the centralized case. The order 
corresponds to both time-precedence and information flow, and it captures 
the notion of "distributed time".

(2) The scheduler, the concurrency control agent of the system, is itself a 
distributed program, consisting of communicating sequential processes [15], 
one for each site. 

(3) Redundancy (the requirement that two entities stored at different sites be 
two copies of the same "virtual entity") is not treated at the syntactic level, 
but is considered as part of the integrity constraints of the database. 
Redundancy was at the root of the complexities of most previous attempts to 
formalize distributed databases. 

As a consequence, there are three measures of performance in a distributed 
database (centralized theory deals with the first two): 

(a) Parallelism, measured as the set of allowable interleavings of user actions. 

(b) Complexity of the computational problems that the processes of the 
scheduler must solve. 

(c) Communication, measured as the number of message exchanges between 
the processes of the scheduler. 



A simple analysis, Theorems 1 and 2, verifies that the model is indeed a 
consistent generalization of the centralized model. 



(II) Schedulers and Games 

The three measures of performance of schedulers present interesting tradeoffs. 
For example, let us fix (a) (think of it as the parallelism specs of the system). By 
expending many messages, we can reduce the problem of distributed concurrency 
control to the centralized one (by broadcasting each request) and thus solve it in 
polynomial time for most reasonable specs [25]. It turns out that, based on a priori
information about transactions, we can minimize the number of messages sent, by 
executing an exponential number of computation steps (and using polynomial space; 
this is the upper bound of our main result). Finally we cannot have a scheduler 
simultaneously using the minimum number of messages and running in polynomial 
time at each site, unless NP=PSPACE (this follows from the lower bound). 

Specifically our main result states that for a certain parallelism specification 
(which in fact can be fixed to be the popular serializability principle [3,17,25,31,40]) 
minimizing communication costs is a computational problem complete for PSPACE 
[11,33,34]. Thus, our result appears to be concrete mathematical evidence
suggesting that distributed concurrency control is indeed an inherently more complex 
task than centralized concurrency control (under quite general conditions, centralized 
schedulers can be implemented in polynomial time [25]). 

Our result also adds to the literature on distributed computation, independently 
of its database context. It states, loosely speaking, that one cannot tell efficiently
whether distributed processes can cooperate successfully for performing (an
otherwise easy) on-line computational task, at fixed communication cost. It can
therefore be considered as complementing the result of Ladner for lockout properties
of "antagonistic" processes [18]. On the other hand, Yao has asked [38] whether
minimizing communications costs for some distributed combinational computation is 
computationally intractable; we answer this in the case of an on-line computation. 

The proofs of both our upper and lower bounds are quite intricate. For the 
upper bound we need a complicated characterization (Theorem 3) of the incomplete 
histories of actions (i.e., partial orders of events in the system) that can be completed
within a fixed number of messages. This upper bound holds for serializable histories,
as well as for all similar parallelism specifications that can be achieved in a 
centralized manner. For the lower bound we relate distributed scheduling to a game 
played on graphs (the "conflict" graph of the transactions). Intuitively one player 
(Player II) is the distributed scheduler, and the other (Player I) is an adversary who 
submits user requests so as to force the scheduler to use as many messages as 
possible. Player I wants to prolong the game as much as possible, whereas Player II 
tries to bring it to an end as soon as possible (other than that there is no winner or 
loser). The rules are related in a simple way to the cycles of the graph. We prove
that this game is complete for PSPACE, and then show that our constructs can 
faithfully reflect a special kind of distributed concurrency control situation. Both 
steps involve intricate "gadget" constructions (Theorem 4).



(III) Distributed Locking

A very common way of implementing concurrency control is by locking. In this 
method each entity is equipped with a binary semaphore (its lock) and transactions 
synchronize their operation by locking and unlocking the entities that they access. 
The purpose of locks is not mutual exclusion of shared resources as in operating 
system theory. Instead they are used to enforce correct sequencing of the indivisible 
transaction steps. 

Locking policies have been extensively studied in the centralized case 
[7,13,21,26,30,39,40] and applied to distributed databases [22,23,35]. Our model
provides a framework for the rigorous study of distributed locking. 

The most elegant result in the theory of centralized locking is a geometric 
method, which efficiently characterizes the safe locking policies for two transactions. 
We examine the distributed version of this problem (i.e., when the transactions are 
partial orders instead of total orders of steps). We propose an alternative graph- 
theoretic approach for the centralized problem, which in addition provides an 
efficient sufficient condition for the distributed problem (Theorem 5). This condition 
is also necessary for transactions distributed at two sites (Theorem 6). Therefore this 
is a positive result (as opposed to the negative complexity results of Chapter 4). It 
also indicates how the difficulty of the problem may be affected by the number of 
sites at which we distribute it.



The material is organized as follows. Section 1.2 contains a review of database 
concurrency control, in which the various notions and results in the area are briefly 
described. Chapter 2 consists of the model definition (Section 2.1) and its simple 
properties, Theorems 1 and 2 (Section 2.2). The relation of distributed scheduling 
and games is rigorously established in Chapter 3. An upper bound on the complexity 
of the distributed problem is derived in Section 3.1 (Theorem 3). The games are 
defined in Section 3.2. Chapter 4 is an analysis of the complexity of these games and 
contains the main technical result, the lower bound in Section 4.1 (Theorem 4). The 
consequences of this result on the existence of schedulers are in Section 4.2. Chapter 
5 provides a framework for the study of distributed locking (Section 5.1), and a 
characterization of safe two-transaction systems (Section 5.2), Theorem 5 for 
sufficiency and Theorem 6 for necessity. Finally, Chapter 6 contains the conclusions 
and a list of open problems and directions for further research. 

The material on the model definition (Chapter 2) and distributed locking 
(Chapter 5) represents a joint effort with Prof. C.H. Papadimitriou. Part of this work, 
namely Chapters 2, 3 and 4, appears in [16].



1.2 A Review of Database Concurrency Control 

A database consists of a set of named data objects called entities. The values of 
these entities must at any time be related in some ways, prescribed by the consistency 
requirements (or integrity constraints) of the database. When a user accesses or 
updates a database, he may have to violate temporarily these consistency 
requirements, in order to restore them at some later time, with the specific data 
changed. For example, in a banking system, there may be no way to transfer funds 
from an account to another in a single atomic step, without temporarily violating the 
integrity constraint "the sum of all balances equals the total liability of the bank". 
For this reason, several steps of the interaction of the same user with the database are 
grouped into a transaction. Transactions are assumed to be correct, that is, they are 
guaranteed to preserve consistency when run in isolation from other transactions. 

When many transactions access and update the same database concurrently, the 
consistency of the database may fail to be restored after all transactions have 
completed. If, for example, transaction 1 consists of the two steps 

x:=x-100 ; 

x:=x+100 

and transaction 2 of the single step x:=1.15 * x, and the consistency requirement is 
simply "x=0", then executing transaction 2 between the two steps of transaction 1
turns a consistent database into an inconsistent one. This is despite the fact that both 
transactions are individually correct, that is, each preserves database consistency when 
run alone. We must therefore find ways to prevent such undesirable interleaving, 
without excessively harming the average user delay and other measures of the 
efficiency of the system. This is the database concurrency control problem, already 
discussed extensively in the literature (see [37]). 
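
The interleaving in this example can be checked mechanically. The Python sketch below is only an illustration: the names (T1_STEPS, run, consistent) are hypothetical, and exact fractions are used in place of the real-valued update so that the arithmetic is not affected by rounding.

```python
from fractions import Fraction

# Steps of the two transactions of the example; each maps x to its new value.
T1_STEPS = [lambda x: x - 100, lambda x: x + 100]
T2_STEPS = [lambda x: Fraction(115, 100) * x]   # x := 1.15 * x, kept exact


def run(schedule, x=Fraction(0)):
    """Apply a sequence of steps to x, starting from a consistent state."""
    for step in schedule:
        x = step(x)
    return x


def consistent(x):
    """The consistency requirement of the example: x = 0."""
    return x == 0


serial_12 = run(T1_STEPS + T2_STEPS)                         # T1, then T2
serial_21 = run(T2_STEPS + T1_STEPS)                         # T2, then T1
interleaved = run([T1_STEPS[0], T2_STEPS[0], T1_STEPS[1]])   # T2 between T1's steps
```

Both serial orders leave x = 0, whereas the interleaved order ends with x = -15, which violates the consistency requirement.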

In this section we present a brief (and by no means complete) review of the 
many results on concurrency control. We start by describing the elements of 
mathematical models used to study these problems in the centralized case. This 
setting will help us to present the theory of centralized database concurrency control 
(part-a). We then discuss how distributing the database affects the formulation of the 
problem and describe some of the proposed practical solutions (part-b). 



(a) The centralized case 

Intuitively a database consists of entities and a finite set of transactions. Each 
transaction is a total order on its actions, which are operations performed indivisibly. 
An action p of a transaction T is, in general, an update (i.e., a read and then a write) 
of an entity x_p, based only on the values of entities updated by actions that precede
this action in the order of T. 

A history, for a set of transactions τ = {T_1, ..., T_m}, is a total order representing an
interleaving of all transaction steps. It is therefore a total order that respects the
order of steps within each transaction. It captures the order of events at the one site, where the database is
stored. A prefix of a history h is an initial portion of h. H is the set of all histories,
that is, all interleavings for all sets τ of transactions.

We are interested in correct histories (i.e., histories that take the database from a
correct initial to a correct final state). A well-known and generally accepted correct 
subset of H is that of serializable histories (SR). A serial history is one with no 
interleaving of actions of different transactions. A history is serializable iff it is 
equivalent (in the obvious schema-theoretic sense with uninterpreted function
symbols for updates) to some serial history. Since each transaction is by itself correct,
a serializable history is obviously correct. Serializability has been widely recognized
as the right notion of correctness (e.g., [2,3,4,17,25,31,40]). In fact it is shown in [17]
that it is the most liberal notion of correctness possible when only syntactic
information (i.e., entity names) is available.

A scheduler is an algorithm handling incoming requests. It might use a priori
information (e.g., the syntax of τ) and run time information (e.g., the order of
incoming requests). The input and output of a scheduler are strings of actions in τ. In
fact, one is the history of requests and the other the history of their execution. A
scheduler is said to realize a set of histories C (where C is a subset of H) if:
(i) for all inputs, the output is a sequence in C,

(ii) for all inputs in C, the scheduler grants all requests immediately upon receipt.
This captures the on-line and optimistic features of schedulers [25].

These sets C were proposed in [25] as a measure, whereby the performance of 
schedulers can be evaluated in a uniform setting. This measure expresses the class of 
all sequences of transaction steps that can be the response of the concurrency 
controller to a stream of execution requests. The richer this class, the fewer 
unnecessary delays and rearrangements of steps will occur, and the greater the
parallelism supported by the system. 

A second measure of performance of a scheduler is the computational complexity 
of the decision problems it must solve. 

The area of concurrency control was unified in [25] by formulating the problem 
as a relation between the two performance measures: 



CC: The problem of Concurrency Control is, given a set C of correct histories, 
find a scheduler which realizes it and is computationally efficient.



A basic theorem in [25] is that such a scheduler exists iff the prefixes of C are 
polynomial time recognizable (i.e. in P). 

The obvious question in this setting is whether an efficient serializer (i.e., a
scheduler realizing SR) exists. The answer is yes. Testing a history for serializability,
or a prefix for whether it has a serializable completion, is an easy task in the
centralized case. The algorithm is based on conflict graphs. The conflict graph G(τ)
for a transaction system τ is a multigraph, with a node for each transaction in τ and
an edge between T_1 and T_2 labeled x, whenever T_1 and T_2 both update entity x.
The order of execution of actions in a history assigns directions to the edges of
G(τ). We call this resolving the conflicts between transactions. The following result is the
"folk" theorem of concurrency control [2,17,25,28,37]:

"A history h is serializable iff it resolves conflicts without creating directed cycles
in G(τ). Similarly, a prefix has a serializable completion iff the already resolved
conflicts do not create a directed cycle in G(τ)."
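
The test suggested by this theorem can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any cited paper: a history (or prefix) is assumed to be a list of (transaction, entity) pairs in execution order, every action is an update of one entity as in the model above, and the function names are hypothetical.

```python
from collections import defaultdict


def resolved_conflicts(history):
    """Directed edges Ti -> Tj: Ti updates some entity before Tj does."""
    edges = defaultdict(set)
    seen = defaultdict(list)      # entity -> transactions that updated it so far
    for txn, entity in history:
        for earlier in seen[entity]:
            if earlier != txn:
                edges[earlier].add(txn)
        seen[entity].append(txn)
    return edges


def has_cycle(edges):
    """Depth-first search for a directed cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)     # every node starts WHITE

    def visit(u):
        colour[u] = GREY
        for v in edges.get(u, ()):
            if colour[v] == GREY or (colour[v] == WHITE and visit(v)):
                return True
        colour[u] = BLACK
        return False

    return any(colour[u] == WHITE and visit(u) for u in list(edges))


def is_serializable(history):
    """Acyclic resolved conflicts: serializable (or completable, for a prefix)."""
    return not has_cycle(resolved_conflicts(history))
```

For instance, is_serializable([("T1", "x"), ("T2", "x"), ("T1", "x")]) is False, since the conflict on x is resolved in both directions, while any serial order of the same actions passes the test.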

The pioneering work in the field was [7], which also introduced concurrency 
control mechanisms such as two phase locking and predicate locks. It was followed by 
many interesting contributions (e.g. [2,13,31]). A number of concurrency control 
mechanisms were compared in the uniform setting of the parallelism measure C 
introduced by [25], where C ⊆ SR. Moreover it was shown that, if we distinguish
between read and write actions, then deciding whether a history is serializable (i.e., in
SR) becomes NP-Complete [25].

A very common way for implementing concurrency control is locking. In this 
method each entity is equipped with a binary semaphore (its lock) and transactions 
synchronize their operation by locking and unlocking the entities that they access. In 
fact, variants are possible in which locks of different kinds are defined, and certain 
kinds may coexist whereas others may not (e.g. shared or read locks, intention locks 
[13]). The lock-unlock steps are inserted in a transaction according to some locking 
policy. A locking policy may have the property that, if all transactions are locked 
according to it, then any execution respecting the locks is guaranteed to be 
serializable. Such a locking policy is called safe. 

Given a transaction system τ, there are certain well-known locking policies that
can be applied to it. One is the two-phase locking (2PL) policy [7]. In it we insert
locks surrounding the accesses of all entities in each transaction, subject to the
following rule: The last entity to be locked is locked before the first entity is 
unlocked. Thus the transaction is divided into two phases: the locking phase, during 
which locks are acquired but not released, and the unlocking phase, in which locks 
are released but not requested. In an extremely conservative interpretation of this 
policy, we could lock all entities before the first step, and unlock them after the last.
More reasonably, we could request locks for entities at the first step that they are
accessed, and release locks at the end of the transaction. In fact, it is shown in [17]
that the latter interpretation of 2PL is the best possible concurrency control, when
syntactic information is acquired in an incremental, dynamic manner. It was first
shown in [7] that 2PL is safe (though deadlock-prone).
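
The two-phase rule is easy to check for a single locked transaction. The sketch below assumes a hypothetical encoding in which a locked transaction is a list of ("lock" | "access" | "unlock", entity) steps; besides the two-phase rule it also checks that every access is surrounded by a lock on the same entity.

```python
def is_two_phase(steps):
    """Return True iff no lock is requested after the first unlock,
    and every access happens while its entity is locked."""
    held = set()
    unlocked_any = False
    for op, entity in steps:
        if op == "lock":
            if unlocked_any:          # locking phase is already over
                return False
            held.add(entity)
        elif op == "unlock":
            if entity not in held:
                return False
            held.remove(entity)
            unlocked_any = True       # unlocking phase begins
        elif op == "access":
            if entity not in held:    # access outside its lock
                return False
        else:
            raise ValueError(f"unknown step: {op}")
    return True
```

A transaction that locks x and y, accesses both, and then unlocks them is accepted; one that unlocks x before locking y is rejected.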

If the entities are unstructured (that is, transactions access them in all possible 
patterns) then 2PL is the best possible locking policy. Suppose, however, that the 
entities form a tree, and are accessed by transactions as follows: 
(i) A transaction accesses a subtree, whose root is the first entity to be accessed (after, 
of course, it is locked), 
(ii) After this, when an entity is locked, its parent must be locked and not yet
unlocked.

Then this locking policy, called the tree policy, is shown in [30] to be both safe and
deadlock-free. This also holds for the more general digraph policy of [39]. In fact, the
latter is generalized in [39] to the hypergraph policy which, it is proved, is the most
general possible safe and deadlock-free policy.
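
Rules (i) and (ii) of the tree policy can be checked as follows. This is a sketch of the two rules exactly as stated here, with a hypothetical encoding (a child-to-parent dictionary and a list of lock and unlock steps); any further conditions of the policy in [30] are not modeled.

```python
def follows_tree_policy(parent, steps):
    """parent: dict mapping each non-root entity to its parent entity.
    steps: list of ("lock" | "unlock", entity) in transaction order."""
    held = set()
    first_lock_done = False
    for op, entity in steps:
        if op == "lock":
            # Rule (ii): after the first lock, the parent must currently be held.
            if first_lock_done and parent.get(entity) not in held:
                return False
            held.add(entity)
            first_lock_done = True    # Rule (i): the first lock is unrestricted
        elif op == "unlock":
            if entity not in held:
                return False
            held.remove(entity)
        else:
            raise ValueError(f"unknown step: {op}")
    return True
```

Note that such a transaction need not be two-phase: it may unlock a parent and later lock a grandchild, as long as the intermediate entity is still held.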

Safe locking policies were characterized in [39]. The limitations of the parallelism
that can be provided by locking were investigated in [26]. Safety of two-transaction 
locked systems can be efficiently decided [21], by employing a geometric 
methodology reminiscent of that used by Dijkstra for studying deadlocks [6]. Besides 
its independent interest and elegance, the two-transaction solution is the building 
block for resolving the general case. It turns out that a locking policy defined on d>2 
transactions is safe iff all of its two-transaction subsystems are safe, plus a 
combinatorial condition. This combinatorial condition turns out to be NP-Complete, 
but it is simple enough to have some interesting corollaries. For example, all specific 
locking policies mentioned above can be shown to be safe as immediate 
consequences of the condition. 



(b) The distributed case 

The assumption that the database is stored at one site is not always true. 
Distributing the database among various sites might be necessary and even desirable. 
In fact the current trend in technology is towards distributed databases
[2,3,4,28,35,36].

In a distributed environment the transactions, histories and prefixes become 
partial orders and the scheduler consists of many communicating sequential 
processes, one at each site. The model presented in Chapter 2 abstracts the relevant 
properties of transactions, actions, histories, prefixes, and schedulers. It extends the 
parallelism measure of schedulers, the concept of serializability and conflict graphs to 
the distributed case. The new elements are, that the scheduler uses message passing 
between sites and that the conflicts are partitioned into the conflicts at every site. The 
problem of Distributed Concurrency Control (DCC) can be formalized as was that of 
Concurrency Control (CC). A rigorous treatment of this problem will require the
selection of a formal system in which to express distributed algorithms (e.g., [9,15,24]).
Such a system, with the least possible restrictions, is selected in the next chapter. 

The problem of concurrency control has been examined by designers of 
distributed databases and various solutions have been proposed. Because of other 
important considerations in a distributed environment, concurrency control is 
viewed (and rightly so) as only one of a number of goals of such systems (other
problems include optimal partitioning of the database, distributed query processing [12],
properties of the communication medium, importance of deadlocks between sites
[22,23], and reliability of updates [14]). What is not clear from these involved distributed
algorithms is whether the distributed version of concurrency control, by itself, is a
more complex task than its centralized version. This in fact is the subject of the 
present study. 

A survey of distributed database concurrency control algorithms is contained in 
[4]. These algorithms are classified into methods using transaction timestamps to 
resolve conflicts [19] and methods using locking (particularly the two phase locking 
rule) [7]. The methods are compared on the basis of the three measures indicated in 
Section 1.1 (i.e., parallelism, complexity, communication), with an additional
distinction between delaying or aborting requests that cannot be safely granted. 
Another issue that is investigated is the effect of having conflicts between read and 
write actions or write and write actions. There are methods which cannot be
classified into this timestamp vs. locking scheme (e.g., voting methods used in [36]).
There are also experimental comparative studies [10,20]. 

A concurrency control method which stands out among all these algorithms is
that employed by SDD-1 [2,3]. The reason for this is its preanalysis of a priori
information (i.e., the structure of the conflict graph) in order to enhance parallelism.
An obvious question is why a similar preanalysis should not be used to enhance the
communication between the processes of the scheduler.

Finally let us mention a new research direction, which developed from the 
distributed problem, but is important even for the centralized case. It is tacitly 
assumed that there is one version of each entity in the database and an update 
creates a new version making the old one obsolete. It might be possible to use older 
versions in addition to the conflict graph, in order to perform concurrency control. 
This is done by changing the semantics of "read" and "write" (e.g., Reed's rule [27], 
before-and-after values [32]). This change in the model can have profound 
consequences, since it introduces a space-parallelism tradeoff (i.e., by using more 
versions the sets of interleavings C that can be realized by schedulers can be 
enriched). 

2. A Model of Distributed Database Concurrency Control

This chapter contains the definition of our model for distributed database 
concurrency control. This model generalizes the centralized model, is simple and can 
be used for the analysis of all practical solutions proposed to date. 



2.1 Model Definition 

A distributed database is a collection of sites. Each site has its own processor and 
data. The sites are interconnected by a network and are controlled by a distributed
database management system (DDBMS). In Fig. 2.1 we show the architecture of a 2- 
site system; horizontal arrows join modules of the same distributed process. 
Formally, a distributed database is defined as follows: 



Definition 1: A Distributed Database Design (DDD) is a quadruple <G_D, Data,
Stored-at, IC> where:

(i) G_D = (V,E) is a graph, where every node corresponds to a site and every link
to a two-way communication link between sites.

(ii) Data is a set of variables (or entities), denoted {x,y,z,...}
(i.e., physical data items).

(iii) Stored-at : Data → V is a function that determines the site where each
physical data item is stored.

(iv) IC is a set of integrity constraints on the values of the Data. □



Note that multiple copies of the same logical data item are considered as different 
physical data items stored at different sites. The fact that they are copies and must 
remain identical for reasons of consistency is part of the integrity constraints, and is 
not treated separately. 

The users interact with the database using transactions. In our model a transaction 
is a distributed program, not identified with a particular site.

Definition 2: A transaction T, in a given DDD, is a directed acyclic graph (dag)
T=(N,A) such that:

(i) every node p is associated with one site of the system, site(p), and with an
entity x_p stored at that site.

(ii) all nodes associated with the same site are totally ordered in A.

A transaction system τ is a set of transactions {T_i}. □
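
The two conditions of Definition 2 can be checked directly on a candidate graph. The Python sketch below is illustrative: the encoding (a dictionary from actions to sites and a set of arcs (p, q), read as "p precedes q") and the function names are assumptions, and "totally ordered in A" is interpreted as pairwise comparability under the paths of A.

```python
from itertools import combinations


def reachability(nodes, arcs):
    """reach[u] = set of nodes reachable from u by a nonempty path."""
    succ = {u: set() for u in nodes}
    for u, v in arcs:
        succ[u].add(v)
    reach = {}
    for u in nodes:
        stack, seen = list(succ[u]), set()
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(succ[v])
        reach[u] = seen
    return reach


def is_transaction(site, arcs):
    """site: dict action -> site; arcs: set of (p, q) meaning p precedes q."""
    reach = reachability(site, arcs)
    if any(u in reach[u] for u in site):          # a cycle: not a dag
        return False
    # Condition (ii): any two actions at the same site must be comparable.
    return all(q in reach[p] or p in reach[q]
               for p, q in combinations(site, 2) if site[p] == site[q])


# Actions 1,2,3 at site 1, actions 4,5 at site 2, action 6 at site 3;
# the only cross-edge encoded here is 1 -> 5 (an assumed fragment).
SITE = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
ARCS = {(1, 2), (2, 3), (4, 5), (1, 5)}
```

With these values is_transaction(SITE, ARCS) holds; dropping the arc (4, 5) leaves two incomparable actions at site 2 and the check fails.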



Note that it is assumed that transactions are correct programs (e.g. update all 
copies of the same logical item in order to preserve the integrity of the database). We 
denote the partial order imposed by a transaction T_i on its actions as >_{T_i}.



Definition 3: The nodes of a transaction are the actions performed by the
transaction. The semantics of an action p is the indivisible execution of the following
two steps:

t_p := x_p

x_p := f_p(t_p, ..., t_q, ...) where q ranges over all actions that are ancestors of p in
the transaction of p.

Here the t's are temporaries (i.e., a workspace local to the transaction) and the
x's are physical items in the database. The f_p's are uninterpreted function symbols. □



Hence the nodes of transactions stand for indivisible actions. We do not specify 
the details of the exact nature of the computation performed by each action. Instead 
we view an action p of a transaction T as an uninterpreted function symbol f_p, with
one output and |{q : q >_T p}| + 1 inputs. The transactions are in fact program
schemata, where all updates are treated by the concurrency control mechanism as
uninterpreted updates. Designing the database (i.e., deciding how many copies of
each item there are and where they are stored) and writing correct transactions (e.g.,
which copies to update, which other integrity constraints to satisfy) are problems at a 
higher level than concurrency control, and are not treated here. 

The particular model of actions used was chosen for its clarity. Other models,
such as those illustrated in examples 2 and 3 below could as well have been used, to 
produce results similar to those of Chapters 3 and 4. 

Example 1 : Consider the transaction of Fig. 2.2(a). Actions 1,2,3 are performed 
at site 1, actions 4,5 at site 2, and 6 at site 3. The actions performed at the same site 
are totally ordered. The actions are updates as in Definition 3, so every node can be 
associated with a variable and the site this variable is stored at. This model 
generalizes the centralized model of [17]. 

Example 2 : Consider the transaction of Fig. 2.2(b) with actions (1,2),(3,4),(5,6) 
performed respectively at sites 1,2,3. If p is odd it is a read action with a readset of 
data items stored at its site. If it is even it is a write action with a writeset instead, and 
this update depends on all readsets (e.g., action 6 has writeset W_6[x,y] and depends
on readsets R_1[w], R_3[u,v], R_5[x], where w is stored at 1, u,v at 2, and x,y at 3). This
type of actions and transaction is used in SDD-1 [2,3]. 

Example 3 : Consider the transaction of Fig. 2.2(c), where action j is performed 
at site j (there is only one action per site). Dataset(j), of arbitrary cardinality, is
updated based on its previous values and those of datasets of ancestor actions. This is 
a very simple model that makes the centralized version trivial (a transaction is an 
action), yet it presents interesting problems in the distributed case. 

An edge in a transaction T between actions at different sites (called a cross-edge), 
denotes both temporal precedence and a transfer of information (i.e., in Fig. 2.2(a) 
update 5 needs data from update 1). These cross-edges correspond to user-defined 
messages, which the system must service. 

A history is a description of a set of transactions and the process of their 
execution on the system. In a distributed system [19] it is in general impossible to tell
which one of two events occurred first (because communication is not always
instantaneous). Because of this uncertainty, we describe the execution order of the
actions by a partial order. If two events are incomparable in this partial order, either
one could have preceded the other. There are two restrictions on the partial orders.
First, what happens at every site is totally ordered; this is consistent with the
centralized problem and guarantees that the result of the execution is uniquely 
determined as in the case of individual transactions. Second, user-specified 
precedences are always respected. Formally:

Definition 4: A history is a pair $h = \langle T, \pi \rangle$, where $T = \{T_i, 1 \le i \le m\}$ is a transaction
system and $\pi$ is a directed acyclic graph (dag) on the nodes of the transactions $T_i$
such that:

(i) Nodes p with the same site(p) are totally ordered.

(ii) For any transaction $T_i$ and actions $p, q \in T_i$ with $p >_{T_i} q$ we have that $p >_{\pi} q$
(where $>_{\pi}$ denotes the partial order imposed by $\pi$). □



Definition 5: A prefix of a history $h = \langle T, \pi \rangle$ is a pair $\langle T, \alpha \rangle$, where $\alpha$ is the
induced subgraph of $\pi$ by a subset of its nodes such that, if action $p \in \alpha$, all ancestors q
of p in $\pi$ belong to $\alpha$. □
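
Definitions 4 and 5 can be checked mechanically on small examples. The following Python sketch is one possible encoding and is not part of the original model: the names `reachable`, `is_history` and `is_prefix`, the dictionary `site`, and the arc lists are assumptions introduced here for illustration. The example transaction is hypothetical; it only borrows the site assignment of Example 1.

```python
from itertools import combinations


def reachable(arcs, nodes):
    """Return all pairs (p, q) such that q can be reached from p along arcs."""
    succ = {p: set() for p in nodes}
    for p, q in arcs:
        succ[p].add(q)
    closure = set()
    for start in nodes:
        stack, seen = list(succ[start]), set()
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(succ[v])
        closure |= {(start, v) for v in seen}
    return closure


def is_history(site, transaction_arcs, pi_arcs):
    """Check that pi is a dag satisfying conditions (i) and (ii) of Definition 4."""
    nodes = set(site)
    before = reachable(pi_arcs, nodes)
    if any((p, p) in before for p in nodes):       # pi must be acyclic
        return False
    for p, q in combinations(nodes, 2):            # (i) same-site nodes are comparable
        if site[p] == site[q] and (p, q) not in before and (q, p) not in before:
            return False
    # (ii) every precedence required by a transaction is respected by pi
    return all((p, q) in before for p, q in transaction_arcs)


def is_prefix(subset, pi_arcs):
    """Definition 5: a node subset is a prefix iff it is closed under ancestors."""
    return all(p in subset for p, q in pi_arcs if q in subset)


# Hypothetical transaction: actions 1,2,3 at site 1, actions 4,5 at site 2, action 6 at site 3.
site = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
t_arcs = [(1, 2), (2, 3), (4, 5), (1, 5)]          # (1, 5) is a cross-edge
print(is_history(site, t_arcs, t_arcs))
print(is_prefix({1, 4}, t_arcs), is_prefix({5}, t_arcs))
```

Closure under direct predecessors is enough in `is_prefix`, because it implies closure under all ancestors.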



A history may be viewed as a special case of a parallel program schema (see Fig. 
2.3). The resulting schemata and the rigorous treatment of their equivalence under
Herbrand interpretation [25] closely resemble the centralized case. 



Definition 6 : Two histories $h_1 = \langle T, \pi_1 \rangle$ and $h_2 = \langle T, \pi_2 \rangle$ are equivalent ($h_1 \equiv h_2$)
iff their schemata are strongly equivalent (that is, equivalent under the Herbrand
interpretation of the function symbols and variables). □



Let H denote the set of all histories. Recall that a partial order can be considered 
as a set of total orders (those compatible with it). Let $H^+$ denote the set of all
histories $\langle T, \pi \rangle$, where $\pi$ is a total order. Therefore a history represents a particular
subset of this basic set $H^+$. The histories with only transaction-defined cross-edges
(arcs between actions at different sites) are maximal when considered as sets of total
orders. Yet histories can have other cross-edges also (e.g., arc (4,6) in Fig. 2.3), whose
presence restricts the allowable total interleavings of actions. The goal of concurrency
control is to recognize on-line large sets of correct total interleavings.
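
The view of a partial order as the set of total orders compatible with it can be made concrete by enumerating linear extensions. The sketch below is an illustration only: the function name `linear_extensions` and the three-action example are assumptions, and the enumeration is exponential in general, so it is meant for toy inputs.

```python
def linear_extensions(nodes, arcs):
    """Yield every total order of `nodes` that is compatible with the dag `arcs`."""
    preds = {v: set() for v in nodes}
    for p, q in arcs:
        preds[q].add(p)

    def extend(order, placed):
        if len(order) == len(preds):
            yield tuple(order)
            return
        for v in sorted(preds):
            # v may come next once all of its predecessors have been placed
            if v not in placed and preds[v] <= placed:
                yield from extend(order + [v], placed | {v})

    yield from extend([], frozenset())


# Hypothetical example: actions 1 and 2 at one site (1 before 2), action 3 at another site.
print(list(linear_extensions([1, 2, 3], [(1, 2)])))
# Adding the cross-edge (2, 3) leaves fewer compatible total orders.
print(list(linear_extensions([1, 2, 3], [(1, 2), (2, 3)])))
```

The second call shows how an extra cross-edge shrinks the set of allowable interleavings.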

Since individual transactions are correct (i.e., take the database from a correct
initial to a correct final state), histories in which transactions are executed one after
the other (serial histories) are correct. Also those histories that are equivalent to
them, called serializable, are correct. We denote the set of serializable histories by SR
($SR \subseteq H$).



Definition 7: A history h is serial iff

(i) The execution of actions at each site introduces a total order of transactions at
that site (i.e., there are no transactions $T_i, T_j$, $i \ne j$, with actions $p, q \in T_i$, $r \in T_j$ performed
at the same site with p preceding r and r preceding q).

(ii) If $T_i$ precedes $T_j$ at one site it does so at all sites where both transactions
have actions. □
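
The two conditions of Definition 7 translate directly into a check on the per-site execution orders. The sketch below follows the definition as literally stated; the names `is_serial`, `txn_of` and `site_orders` are assumptions made for this illustration, and the examples are hypothetical.

```python
def is_serial(txn_of, site_orders):
    """Check conditions (i) and (ii) of Definition 7.

    txn_of maps each action to its transaction; site_orders maps each site
    to the list of actions in the order they were executed there.
    """
    precedes = set()
    for order in site_orders.values():
        blocks = []                       # transactions in order of first appearance
        for p in order:
            t = txn_of[p]
            if not blocks or blocks[-1] != t:
                if t in blocks:           # (i) violated: t resumes after another transaction
                    return False
                blocks.append(t)
        for a, s in enumerate(blocks):
            for t in blocks[a + 1:]:
                precedes.add((s, t))      # s precedes t at this site
    # (ii) no two transactions are ordered differently at different sites
    return not any((t, s) in precedes for s, t in precedes)


txn_of = {1: "T1", 2: "T1", 3: "T2", 4: "T2"}
print(is_serial(txn_of, {1: [1, 3], 2: [2, 4]}))   # same order at both sites
print(is_serial(txn_of, {1: [1, 3], 2: [4, 2]}))   # opposite orders at the two sites
```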



Definition 8: A history is serializable iff it is equivalent to a serial history. □



In the next section we will show that deciding serializability is an easy task. This
task becomes NP-Complete if the model with read and write actions (instead of
updates) is used [25]. Even in that case SR has interesting efficiently recognizable
subsets (i.e., DSR [25]). What is significant is that deciding whether a history is
serializable in a centralized or in a distributed model are practically identical tasks. We
discuss this similarity in the next section.

As in the centralized case, synchronization is necessary only between actions of a
transaction system which operate on the same data (i.e., conflict). These conflicts are
represented by the conflict graph $G(T)$.

Definition 9 : For the transaction system $T = \{T_i, 1 \le i \le m\}$, the conflict graph
$G(T)$ is an undirected multigraph (V,E), with a partial order $>_i$ associated to the
edges incident upon each node i, such that:

(a) $V = \{i \mid 1 \le i \le m\}$, with node i corresponding to transaction $T_i$.

(b) E is a multiset of edges: E = { copies of edge ij | for every copy of ij there is a
distinct pair of actions {p,q} with $p \in T_i$, $q \in T_j$, $i \ne j$ and $x_p = x_q$ }.

(c) For two edges incident at node i we have $ij >_i ik$ iff the action in $T_i$
corresponding to ij is identical to or precedes the action in $T_i$ corresponding to ik. □



Note that an edge in E denotes a conflict between two transactions. Every edge ij 
in E corresponds to a pair of actions {p,q}, which update the same variable. Based 
on where this variable is stored we can partition E into as many multisets as there are 
sites (e.g., "red" and "green" edges for two sites). For an example see Fig. 2.4.
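
As a small illustration of the conflict graph and of its partition by site, the following sketch counts the parallel edges between each pair of transactions, grouped by the site where the conflicting variable is stored. The function name `conflict_graph`, the input encoding, and the example are assumptions made here; the partial orders on the edges incident at each node are omitted for brevity.

```python
from collections import Counter
from itertools import combinations


def conflict_graph(transactions, site_of_var):
    """Return a Counter mapping (i, j, site), i < j, to the number of parallel
    edges ij that come from conflicts on variables stored at `site`.

    transactions maps a transaction index to a list of (action, variable) pairs.
    """
    edges = Counter()
    for i, j in combinations(sorted(transactions), 2):
        for _p, x in transactions[i]:
            for _q, y in transactions[j]:
                if x == y:                       # both actions update the same variable
                    edges[(i, j, site_of_var[x])] += 1
    return edges


# Hypothetical system: T1 updates x then u, T2 updates u then x.
transactions = {1: [(1, "x"), (2, "u")], 2: [(3, "u"), (4, "x")]}
site_of_var = {"u": 1, "x": 2}
print(dict(conflict_graph(transactions, site_of_var)))
```

Each key carries the site as its third component, which plays the role of the edge color mentioned above.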

An ordered mixed multigraph $G = (V, E, A, \{>_i\})$ is a mixed multigraph with E a
multiset of edges, A a multiset of directed edges and a partial order $>_i$ at each node i
of the edges incident at the node. Conflict graphs are such objects with $A = \emptyset$.

Since a conflict (or an edge in $G(T)$) corresponds to two actions at the same site
and a history $h = \langle T, \pi \rangle$ has a total order of the actions at each site, we can say that a
history resolves all conflicts. That is, if edge ij corresponds to the pair of actions
{p,q}, $p \in T_i$, $q \in T_j$, $i \ne j$, we direct ij from i to j iff $p >_{\pi} q$.



Definition 10 : A prefix $\langle T, \alpha \rangle$ of a history assigns a direction (ij) to an edge ij of
the conflict graph $G(T)$ iff all histories which have $\langle T, \alpha \rangle$ as prefix assign ij the
direction (ij). Thus a prefix $\langle T, \alpha \rangle$ determines an assignment of directions to some
edges of the conflict graph.

Conversely, an assignment of directions to edges of the conflict graph is
realizable by a prefix if there is a prefix of a history assigning these directions and
no others. □

Thus a prefix $\langle T, \alpha \rangle$ determines a unique ordered mixed multigraph $G_{\alpha}(T)$,
which is $G(T)$ with some of its edges directed.

Up until now the distributed problem appears to be a straightforward
generalization of the centralized case. What is considerably more complex in the
distributed case is the subject of schedulers, and their design to meet performance
specifications. For an exposition of the relatively simple theory for the centralized
case see [25].

Our schedulers will be distributed algorithms characterized by the parallelism 
they provide and by their efficiency. We will measure parallelism using sets of 
histories C, that is subsets of H. The efficiency of the schedulers will be measured by 
the worst-case number of steps they execute and the worst-case number of messages 
they use. We will be interested in the following special C's: 



Definition 11 : Consider a set of histories $C \subseteq H$, such that for each $h \in C$ the only
cross-edges (edges between actions at different sites) are defined by the transactions.
Such a C we shall call a concurrency control principle. □

C is chosen in such a way that all $h \in C$ are correct. The larger C is, the higher
the level of parallelism supported by this concurrency control principle. Examples of
concurrency control principles are serializability and serial (one-at-a-time) execution.
Obviously, the former supports more parallelism. Thus concurrency control
principles are very natural classes of histories measuring parallelism, although not all
subsets of H can be expressed as such.



A scheduler Q is a distributed algorithm. (We do not explicitly specify the 
model of computation, although we shall use a concurrent language notation as 
needed). It consists of a set of communicating sequential processes [15], one for 
each site. Its instructions may involve the following: 

1) Local Computation 

2) Receiving an execution request for an action q. 

3) Granting an execution request of an action q. 

4) Sending a message to another site (i.e., send(<message>))

5) Receiving a message from another site

Each history h corresponds to a set $\{h^+\}$ of total orders (those that do not
contradict h). Let $h^+$ denote any total order which respects the partial order of
history h. If C is a set of histories, we let $C^+ = \{h^+ \mid h \in C\}$. H is the set of all
histories. An element of $H^+$ is a string, that is, a mapping from {1,2,...,n} to N,
where N is the set of all actions and |N| = n. In fact it is a pair $\langle T, \text{string} \rangle$, but we
omit T when it is obvious from the context. The jth symbol of $h^+ \in H^+$ is
denoted by $h_j^+$.

We thus assume that there is a total order on the arriving execution requests. 
This is a simplifying analytical tool (a formalism of the familiar notion of a 
timestamp ) and is not used by the scheduler, whose processes still perceive the 
world in terms of partial orders. We therefore have a global clock, whose ticks 
are the arrivals of execution requests. This sequence of execution requests is the 
input of the scheduler. What is the output of a scheduler? It cannot be just a
sequence of actions, as the relative ordering of the granting of requests with
respect to their arrival is also important. The output of the scheduler is an n-tuple
of strings $S = (s_1, s_2, \ldots, s_n) \in (N^*)^n$. Here $s_j$ denotes the sequence of granted
requests between the jth and (j+1)-st (after the jth if j=n) arrivals of requests.
$N^*$ is the set of all strings constructed from the set of actions N and includes the
empty string. The concatenation of the n strings, conc(S), should be in $H^+$.

Thus a scheduler Q, besides being a distributed algorithm, is a
nondeterministic mapping (i.e., a set of mappings) from $H^+$ to $(N^*)^n$.

For each total order $h^+$, Q will produce a stream S of granted requests; one
nondeterministic element is that of the various communication delays. A set of
communication delays is a function d, which assigns to each execution of a send
instruction by a process of Q a nonnegative real number. Not all functions are
delay functions. The delay function has to be feasible, in that an action p must
be executed before a successor q of p, in its transaction, can be requested. Note
that the zero function d=0 is always a feasible delay function. Therefore the
mapping $Q_d : H^+ \to (N^*)^n$ is well-defined for each feasible delay function d,
assuming that local computation proceeds at a rate far faster than the arrival of
requests and messages. □

Consider a set of histories $C \subseteq H$. Scheduler Q realizes C if all outputs of Q are
in C (and thus presumably correct) and, furthermore, if Q is fed with a history in C
and all delays are 0, then Q grants all requests without making them wait. It is
argued in [25] that these are traits, in the centralized case, of all schedulers that are
on-line and optimistic (two intuitive properties shared by all existing schedulers). The
same arguments are applicable to justify Definition 12, where total orders and strings
of actions are used to formalize this intuition.

Each process makes decisions about whether to grant or delay pending requests. 
These decisions can only depend on the information available to each process 
(i.e., T and the requests that it knows have been granted or are pending). This can be
viewed as a consequence of the power of the set of instructions used (see above). 



Definition 12: We say that Q is a realization of C iff

(a) $\mathrm{conc}(Q_d(h^+)) \in C^+$ for all $h \in H$ and delay functions d.

(b) $Q_0(h^+) = (h_1^+, \ldots, h_n^+)$ for all $h \in C$. □
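
To make the input-output view of a scheduler concrete, the sketch below represents an input $h^+$ as a list of actions in arrival order and an output S as a list of n lists of granted actions. The helper names `conc`, `is_well_formed_output` and `granted_without_delay` are assumptions made for illustration. The well-formedness test only captures two basic requirements (each request is granted exactly once and never before it arrives); membership of conc(S) in $C^+$ would need a separate test that depends on C.

```python
def conc(S):
    """Concatenate the n strings of a scheduler output S."""
    return [a for s in S for a in s]


def is_well_formed_output(h_plus, S):
    """Every request is granted exactly once and not before its arrival."""
    if len(S) != len(h_plus) or sorted(conc(S)) != sorted(h_plus):
        return False
    arrived = set()
    for request, granted in zip(h_plus, S):
        arrived.add(request)              # the j-th request arrives, then s_j is granted
        if not set(granted) <= arrived:
            return False
    return True


def granted_without_delay(h_plus, S):
    """Shape required by condition (b): s_j consists of exactly the j-th request."""
    return [list(s) for s in S] == [[a] for a in h_plus]


h_plus = [1, 2, 3]
print(is_well_formed_output(h_plus, [[1], [], [2, 3]]))   # action 2 is delayed
print(granted_without_delay(h_plus, [[1], [], [2, 3]]))
print(granted_without_delay(h_plus, [[1], [2], [3]]))
```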



We illustrate the above definition in Fig. 2.5. If $h^+ \in H^+$ is the input to Q there
are many possible computation paths (i.e., sequences of events in the system). This is
because of the essentially random delivery time of the messages. So every path has
associated with it the delays of messages used along this path and has output
$(s_1, s_2, \ldots, s_n) \in (N^*)^n$. The conditions are that the granted requests always form a
correct history (a) and, moreover, if requested actions form a correct history and all
delays are zero, then the requests must be granted immediately (b). These conditions
must hold for all computation paths. So there is a difference between the use of the
term nondeterminism above and that of classical complexity theory.

There also is a feedback effect from output to input (i.e., requests cannot be 
made if their ancestors in transactions have not been granted). This problem, which 
is due to our choice of an input-output description, could restrict the set of inputs to a
particular scheduler. Yet all prefixes of histories in C must still be inputs to all 
schedulers realizing C. This is also true for all prefixes not in C that are minimal 
(their prefixes are in C). These will be the only inputs of interest in Theorem 3. 



Figure 2.5: computation paths of a scheduler Q on an input $h^+ \in H^+$.

Definition 13: The computational complexity of Q is the worst-case sum of the 
counts of all local computations by Q over all processes of Q. The communication 
complexity of Q is the worst-case count of all send instructions executed by all 
processes of Q. □



Note that apart from the messages generated by the scheduler processes of the
system there is also user-defined communication, implied by transaction cross-edges
(e.g., some action at site 2 needs data from site 1). This communication is assumed
free, since it is unavoidable, and can be used to pass information between scheduler
processes at no cost.

A scheduler Q is polynomial time bounded (or computationally efficient) if its
computational complexity is bounded by a polynomial in n (i.e., n=|N|, where N is the set
of actions of T). This means that all possible computation paths have computational
complexity (number of local steps) bounded by a polynomial in n.

We may even augment the computation power of our schedulers if we allow
them, in their local computation steps, to consult an oracle [11] for a hard
computational problem (say an NP-Complete problem). Many of our results will still
hold for such schedulers.

Finally, in order to characterize communication complexity we define the
following classes $M_C(b)$:

Definition 14: For a prefix $\langle T, \alpha \rangle$ of C and an integer $b \ge 0$ we say that
$\langle T, \alpha \rangle \in M_C(b)$ if there is a realization Q of C such that the total sum of send
instructions executed at all processes of Q after $\langle T, \alpha \rangle$ is b or less.

Let $b^*(T)$ be the least b for which $\langle T, \emptyset \rangle \in M_C(b)$. A scheduler which achieves
$b^*(T)$, for every T, is called communication-optimal. □

Note that $M_C(b) = \emptyset$ if b<0 and $M_C(b) \subseteq M_C(b+1)$. This definition describes the
communication used if both processes of the scheduler are started with initial
information $\langle T, \alpha \rangle$.

What Definition 14 says is that a priori information about the syntax of the 
transactions could be used to enhance the communication performance (worst-case 
number of messages used at run time) of the concurrency control mechanism. This is 
analogous to the conflict graph analysis used to improve parallelism in SDD-1 [2,3].
A communication optimal scheduler is the limit in message performance attainable, 
subject to a parallelism requirement C. 

In Section 2.2 we will show that our model is a simple generalization of the
centralized case and that there exists a computationally efficient scheduler realizing
SR. In Chapter 3 we will recursively characterize the classes $M_C(b)$ and prove that
there exists a communication optimal scheduler realizing SR. Finally in Chapter 4 we
will examine the complexity of deciding whether a prefix is in $M_C(b)$ and prove
that, if $NP \ne PSPACE$, no scheduler can realize SR and be both computationally
efficient and communication optimal. This will be true even if we restrict our system
to two sites, and our transactions to sequences of six updates each.

2.2 Properties and Limitations of the Model

The model presented in Section 2.1 consisted of extending the definitions of 
centralized concurrency control by introducing, where necessary, partial orders 
instead of total orders and by partitioning the conflicts according to sites. A more 
technical part was involved with defining the class of allowable distributed 
schedulers. We can now state the distributed problem we will examine: 



DCC : The problem of Distributed Concurrency Control is, given a set of 
histories C (which we can prove correct), find a scheduler, which realizes C and is 
efficient (in terms of both local computation and communication). 



Similarly to [25] we can prove: 



Theorem 1 : C has a computationally efficient realization iff the set of prefixes of
C is in P (i.e., deterministic polynomial time).

Proof : Since we can expend an indefinite amount of communication between
the different modules of a scheduler, the problem reduces to the centralized one (one
site gathers all information and makes all decisions). Therefore the constructive proof
of [25] is applicable. For arbitrary delays this construction gives us outputs in $C^+$;
for zero delays Definition 12(b) is also satisfied. □



Since the analysis we will be presenting deals primarily with the assignment of
directions to edges of the conflict graph $G(T)$ by a prefix $\langle T, \alpha \rangle$, we need a
characterization of realizable assignments (see Definition 10).

Lemma 1 : Given a conflict graph $G(T) = (V, E, \emptyset, \{>_i\})$, an assignment of
directions to a multiset X of its edges, producing the ordered mixed multigraph
$(V, E \setminus X, A_X, \{>_i\})$, is realizable iff

(a) If $ij \in X$ and is directed from i to j and $ik >_i ij$ then $ik \in X$.

(b) $A_X$ has no directed cycles $(i_1 i_2 i_3 \ldots i_n i_1)$ such that:
$i_1 i_2 >_{i_2} i_2 i_3,\ i_2 i_3 >_{i_3} i_3 i_4,\ \ldots,\ i_n i_1 >_{i_1} i_1 i_2$.
Proof : "only if" Given a prefix $\langle T, \alpha \rangle$ of a history let us first assign the direction
(ij) to any edge ij in $G(T)$, which corresponds to a pair of conflicting actions {p,q},
under the following conditions:

(1) $p \in T_i$, $q \in T_j$

(2) $p \in \alpha$

(3) if $q \in \alpha$ then $p >_{\alpha} q$

Obviously all histories which have $\langle T, \alpha \rangle$ as prefix resolve these conflicts in the
same way. Moreover, if an edge has not been given a direction then both its actions
p', q' are not in $\alpha$. We can complete $\langle T, \alpha \rangle$ with suffixes of histories that have p', q' in
both orders. This proves that the directions we have constructed are exactly those
assigned by $\langle T, \alpha \rangle$.

Because of causality both conditions (a) and (b) obviously hold for the directions
constructed above.

"if" Given an assignment $A_X$ we construct the following digraph $(V_0, A_0)$:

$V_0$ (vertex set):
If $(ij) \in A_X$ and ij corresponds to conflicting actions {p,q}, $p \in T_i$, then $p \in V_0$.
If $p \in V_0$, $p \in T_i$, then all ancestors of p in $T_i$ belong to $V_0$.

$A_0$ (arc set):
If p,q belong to the same $T_i$ and $p >_{T_i} q$ then $(pq) \in A_0$.
If p,q correspond to an $(ij) \in A_X$ then $(pq) \in A_0$.

Since (b) is true $(V_0, A_0)$ is acyclic and since (a) is true transaction precedences
are respected. Thus $(V_0, A_0)$ has the same nodes as some prefix and respects all its
conflict resolving orderings (see the "only if" part of the proof). By topologically sorting
the nodes of $(V_0, A_0)$ we can produce the desired prefix. □

We will now characterize the serializable histories and prove that the prefixes of
SR are polynomially recognizable (in P).

For the model of actions we are using (i.e., $t_p := x_p$; $x_p := f_p(t_p, \ldots, t_q, \ldots)$) we say
that action p reads a variable x from q in history $h = \langle T, \pi \rangle$, if $x_p = x_q = x$ and q is the
ancestor of p closest to p in $\pi$. The reads-x-from relation in our model is always a
chain of all actions p for which $x_p = x$. The chains for all x's give us the reads-from
relation. It is easy to see that we can represent the reads-from relation for a given
history $h = \langle T, \pi \rangle$ as a directed multigraph D(h), with nodes corresponding to
transactions and edges corresponding to edges of these chains (labelled by the
variable read and the action reading it). In D(h) we can ignore arcs of the form (i,i)
because we can deduce these from T.

Since histories are program schemata, we have from standard schemata 
equivalence theory [25]: 



Proposition 1 : Two histories $h_1 = \langle T, \pi_1 \rangle$ and $h_2 = \langle T, \pi_2 \rangle$ are equivalent iff
$D(h_1) = D(h_2)$ (i.e., they have the same actions and the same reads-from relation). □

For other models of actions it is necessary to distinguish between live and dead
transactions [25]. In our model, all transactions are live. Obviously for a serial history
$h_s$, $D(h_s)$ is acyclic.

The following theorem (an obvious generalization of the centralized case) is yet 
another variant of a veritable "folk" theorem [3,17,25,28,40]: 



Theorem 2: A history h is serializable iff it resolves conflicts without creating
directed cycles in $G(T)$. Similarly, a prefix has a serializable completion iff the
already resolved conflicts do not create a directed cycle in $G(T)$.

Proof : Let D(h) represent the reads-from relation for h. If $h \equiv h_s$ for $h_s$ serial
then $D(h) = D(h_s)$, which is acyclic. If D(h) is acyclic we can find a total order of
transactions by topologically sorting it and then consider the serial history which
respects this total order on all processors. This serial history has the same D(h). The
only difference from the centralized case is that D(h) can be partitioned into as many
subdags as there are sites.

It is easy to see that D(h) is acyclic iff $G(T)$ with the assigned directions is
acyclic. A scheduler, recognizing serializable interleavings and knowing of all
requests (operating in a centralized manner), would arbitrate requests on-line by
making sure that the assignment of directions to the conflict graph introduces no
directed cycles. This can be done in polynomial time. Therefore there is a
computationally efficient scheduler realizing SR. □
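
The test suggested by Theorem 2 can be sketched in a few lines: direct every conflict according to the total order at the site where it occurs, then check the resulting graph on transactions for directed cycles. This is only an illustration for the update model, in which two actions conflict when they update the same variable. The names `is_serializable`, `var_of`, `txn_of` and `site_orders` are assumptions, and the cycle check uses the standard Kahn topological-sorting procedure rather than anything prescribed by the text.

```python
def is_serializable(var_of, txn_of, site_orders):
    """Direct each conflict by the per-site execution order and test acyclicity.

    var_of maps an action to the variable it updates, txn_of maps an action to
    its transaction, and site_orders maps a site to its list of executed actions.
    """
    arcs = set()
    for order in site_orders.values():
        for a, p in enumerate(order):
            for q in order[a + 1:]:
                if var_of[p] == var_of[q] and txn_of[p] != txn_of[q]:
                    arcs.add((txn_of[p], txn_of[q]))   # p was executed before q

    # Kahn's algorithm: the graph is acyclic iff every node can be removed.
    nodes = set(txn_of.values())
    indegree = {t: 0 for t in nodes}
    for _, t in arcs:
        indegree[t] += 1
    ready = [t for t in nodes if indegree[t] == 0]
    removed = 0
    while ready:
        s = ready.pop()
        removed += 1
        for a, b in arcs:
            if a == s:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return removed == len(nodes)


# Hypothetical system: T1 = {1, 2}, T2 = {3, 4}; x is stored at site 1 and y at site 2.
var_of = {1: "x", 2: "y", 3: "x", 4: "y"}
txn_of = {1: "T1", 2: "T1", 3: "T2", 4: "T2"}
print(is_serializable(var_of, txn_of, {1: [1, 3], 2: [2, 4]}))
print(is_serializable(var_of, txn_of, {1: [1, 3], 2: [4, 2]}))
```

Only the per-site total orders enter the test, since a variable is stored at a single site and both conflicting actions are executed there.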



It is easily seen from the above analysis that histories with the same total orders
on each site are equivalent and cross-edges are not needed for deciding 
serializability. These edges, between actions at different sites, can be used in relating 
histories and performance of distributed schedulers. 

Let us end this Chapter with a brief discussion on the properties of our model. 
The advantages of this model are: 

(a) generality : All models of transactions and schedulers proposed have the 
properties of our model. Variations in the format of transactions (i.e. defining 
separate read and write actions) do not affect the results that will be presented. 

(b) mathematical simplicity : All cases are treated uniformly (i.e. copy
equivalence is just one more instance of the integrity constraints). All questions are 
reduced to questions on concrete combinatorial objects (e.g. conflict graphs). There 
are no hidden assumptions since the performance measures (parallelism, computation 
steps, messages) and the model of distributed algorithms are well-defined. 

(c) compatibility : The model is an extension of the centralized case. In Section 
5.1 we will be able to express distributed locking policies in the model, just as was 
done in the centralized case. 

(d) correctness : Serializability is not the only notion of correctness, but it is 
certainly the most generally accepted one. It is intimately related to the a priori 
information about the syntax of T.

On the other hand there are some disadvantages:

(e) Restricting attention to the three measures of performance : We ignore goals
which are important for distributed systems but hard to treat mathematically (e.g.
reliability of the update mechanism, which is usually handled by two phase commit
protocols [14]).

(f) The assumption that all syntactic information is known at run time: 
Information about transactions is not always available before the transaction is 
initiated. There is a whole spectrum of possibilities, between total syntactic 
information being known before run time (static case) and the completely dynamic 
case, in which information is acquired for each action separately as it is presented for 
execution. 

(g) The measure of parallelism used (i.e., the size of the set $C \subseteq H$) is a crude
approximation of the average user delay [25].

These disadvantages are shared by most formal work on database concurrency
control.

3. Communication-Optimal Schedulers and Games

We will now state and prove a theorem, which relates the structure of histories 
and their prefixes with the number of messages necessary and sufficient to achieve a 
performance C. 



3.1 A Recursive Characterization of Communication Complexity 

As defined in Section 2.1 the performance measure for parallelism C is a set of
histories (i.e., $C \subseteq H$). In this section we require C to be a concurrency control principle
(see Definition 11). Concurrency control principles are very natural classes of
histories measuring parallelism (examples are serializability SR, and serial execution
S). Let PR(C) be the set of prefixes of histories in C. Two properties of C are used in
our recursive characterization of communication complexity. First, if C is a
concurrency control principle, then for each $h \in C$ the only cross-edges (edges
between actions at different sites) are defined by the transactions. Second, we have
an efficient (polynomial time in n) test of membership of a prefix in PR(C) (for
example, if C=SR Theorem 2 provides us with such a test). If no such test is
possible, concurrency control is quite hopeless, even in the centralized case [25].

Let us briefly review the notation used. A prefix is denoted as a pair $\langle T, \alpha \rangle$,
where T represents the transactions (a priori syntactic information) and $\alpha$ the order
in which some actions were executed. We use $\alpha$ for $\langle T, \alpha \rangle$ when there is no
ambiguity about T. Also $(\beta/\alpha)_i$ denotes the prefix of $\beta$ that contains $\alpha$ and all actions
of $\beta$ at site i (the projection of $\beta$ at site i given $\alpha$). So $\alpha$ is a prefix of both $\beta$ and $(\beta/\alpha)_i$.
Finally we use $M_C(b)$, where $M_C(b) \subseteq PR(C)$, for the set of all prefixes $\langle T, \alpha \rangle$ of C
such that there is a realization of C which, when started with $\langle T, \alpha \rangle$, sends b or fewer
messages.

Theorem 3 : Let C be a concurrency control principle, $\langle T, \alpha \rangle$ a prefix in PR(C),
and b a nonnegative integer. Let i denote an index ranging over the site numbers,
$i \in \{1,2\}$. Then the following are equivalent:

(I) $\langle T, \alpha \rangle \in M_C(b)$

(II) For all $\langle T, \beta \rangle$: if

(1) $\langle T, \beta \rangle \notin PR(C)$ and
(2) $\forall i$ $\langle T, (\beta/\alpha)_i \rangle \in PR(C)$

then

(3) $\forall i$ $\langle T, (\beta/\alpha)_i \rangle \in M_C(b)$ and
(4) $\exists i$ $\langle T, (\beta/\alpha)_i \rangle \in M_C(b-2)$.

Less formally (II) reads as follows:
"For all continuations $\alpha_1, \alpha_2$ of $\alpha$ such that $\alpha_1$ is $\alpha$ with some actions at site 1 added,
and $\alpha_2$ is $\alpha$ with some actions at site 2 added, and such that their least common
continuation $\beta$ is not a prefix of C (while $\alpha_1, \alpha_2$ are) the following holds:
$\langle T, \alpha_1 \rangle, \langle T, \alpha_2 \rangle \in M_C(b)$ and one of them is in $M_C(b-2)$."

We will first give an intuitive interpretation of Theorem 3 (which is illustrated in
Fig. 3.1). Consider a scheduler which realizes C, starts from $\langle T, \alpha \rangle$ and receives input
requests $\langle T, \beta \rangle$. Each one of the scheduler processes i, $i \in \{1,2\}$, can see $(\beta/\alpha)_i$ without
sending any messages. This is because process i (e.g. process 1 in Fig. 3.1) knows $\alpha$
(e.g. $\alpha = \emptyset$ in Fig. 3.1), receives the actions of $\beta$ to be executed at site i (e.g. actions 4 and
5 in Fig. 3.1) and, using the transaction-defined messages (e.g. action 5 needs data
from action 6 in Fig. 3.1), can learn about some actions at the other site (e.g. actions 6
and 8 in Fig. 3.1).
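
What a process can see without scheduler messages can be computed as an ancestor closure: start from $\alpha$ and the actions of $\beta$ at site i, and add every ancestor reachable through the arcs of $\beta$, including transaction-defined cross-edges. The sketch below is one possible reading of the projection $(\beta/\alpha)_i$; the function name `project`, its arguments, and the four-action example are assumptions made here and do not reproduce Fig. 3.1.

```python
def project(beta_nodes, beta_arcs, alpha_nodes, site, i):
    """Return the node set of the smallest prefix of beta that contains alpha
    and all actions of beta performed at site i."""
    preds = {}
    for p, q in beta_arcs:
        preds.setdefault(q, set()).add(p)
    result = set(alpha_nodes)             # alpha is a prefix, hence already ancestor-closed
    stack = [p for p in beta_nodes if site[p] == i]
    while stack:
        v = stack.pop()
        if v not in result:
            result.add(v)
            stack.extend(preds.get(v, ()))   # ancestors become known via cross-edges
    return result


# Hypothetical example: actions 1, 2 at site 1 and 3, 4 at site 2;
# action 2 needs data from action 3 (a transaction-defined cross-edge).
site = {1: 1, 2: 1, 3: 2, 4: 2}
arcs = [(1, 2), (3, 4), (3, 2)]
print(project({1, 2, 3, 4}, arcs, set(), site, 1))   # site 1 also learns about action 3
print(project({1, 2, 3, 4}, arcs, set(), site, 2))
```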

A situation that forces communication is one where the projections of the input
that each process sees directly seem correct (i.e. $\langle T, (\beta/\alpha)_i \rangle \in PR(C)$) and therefore
must be executed on-line to achieve the goal C, yet the real input could be incorrect
(i.e. $\langle T, \beta \rangle \notin PR(C)$). For the example in Fig. 3.1, $\alpha = \emptyset$ and there is a unique
minimal "bad" continuation $\beta$. We use $\alpha_i$ as a shorthand for $(\beta/\alpha)_i$ when there is no
ambiguity.

Theorem 3 tells us that these are the only cases for which we need
communication between scheduler processes; furthermore, to guard against such
"bad" $\beta$'s only one $(\beta/\alpha)_i$ (say $(\beta/\alpha)_{i^*}$, or $\alpha_{i^*}$ for short) has to be in $M_C(b-2)$. The
communication protocol is built in such a way that the corresponding site i* will ask
for the approval of the other site in order to execute $\alpha_{i^*}$. There is therefore a
balancing of the send instructions among the two processes of the scheduler, with
each send instruction guarding against a "bad" $\beta$.

The rigorous proof of Theorem 3 is given below. In one direction it entails an
adversary argument and case analysis. For the other direction we give an explicit
recursive construction of scheduler processes that realize C, within the prescribed
number of messages. The basic idea of this construction is the following: Let $\alpha_i$ be
correct continuations of $\alpha$ and projections of an incorrect $\beta$. Let $Q_i$ (i=1,2) be a
message-optimal protocol, given that $\alpha_i$ has been executed. Then the $Q_i$'s can be
combined to produce a Q that is message optimal, given that $\alpha$ has been executed. If
$Q_i$ uses more messages than $Q_j$, then the process of Q at site j will have the send
instruction guarding against $\beta$.

Figure 3.1

(a) Transaction system (u,v,w at site 1, x,y,z at site 2) (e.g. action 1 updates x)

(b) Conflict graph (two line styles distinguish conflicts at site 1 from conflicts at site 2)

(c) Illustrating Theorem 3. Above: prefixes. Below: assignments of directions.

Proof of Theorem 3 : Let $\alpha_i$ denote $(\beta/\alpha)_i$. Theorem 3 recursively characterizes
$\langle T, \alpha \rangle \in M_C(b)$, based on prefixes $\langle T, \alpha_1 \rangle$, $\langle T, \alpha_2 \rangle$, which properly contain $\langle T, \alpha \rangle$. The
containment is proper because of conditions (1),(2) of the Theorem. The last actions
of $\langle T, \beta \rangle$ at sites 1 and 2 ($p_1$ and $p_2$ respectively) are concurrent and not contained in
$\alpha$. Consequently $\alpha_1$ containing $p_1$ and $\alpha_2$ containing $p_2$ are not prefixes of each other.
Note that in order to terminate the recursion we use the following facts: if b<0 then
$M_C(b) = \emptyset$ and if h is a history in C then $h \in M_C(0)$. For b=0 the statement of the
theorem becomes: "$\langle T, \alpha \rangle \in M_C(0)$ iff no prefixes $\langle T, \beta \rangle$ exist satisfying conditions
(1),(2) and (3)".

"(I) $\Rightarrow$ (II)" We will now prove that if $\beta$ exists with properties (1),(2) and (3) and
$\{\langle T, \alpha_1 \rangle \notin M_C(b)\} \lor \{\langle T, \alpha_2 \rangle \notin M_C(b)\} \lor \{\langle T, \alpha_1 \rangle \notin M_C(b-2) \land \langle T, \alpha_2 \rangle \notin M_C(b-2)\}$

then $\langle T, \alpha \rangle \notin M_C(b)$. This is obvious if one of the first two members of the above or
clause is true. If both are false but the third member is true we will prove that
communication involving two messages is forced, between the execution of $\langle T, \alpha \rangle$
and that of $\langle T, \alpha_1 \rangle$ or $\langle T, \alpha_2 \rangle$, for all schedulers realizing C. For this we will use the
general specifications for a programming system as outlined in Section 2.1.

Consider the following situation that the process of the scheduler at site 1 (site 2)
can face. It receives request $p_1$ ($p_2$), while knowing that certain requests $\langle T, \gamma \rangle$ have
been granted with $\gamma = \alpha_1 - \{p_1\}$ ($\gamma = \alpha_2 - \{p_2\}$). It has to decide whether to grant or
delay $p_1$ ($p_2$). If it grants the request, then according to its local view of the input the
result would be correct. Its local view of the input can be the actual input, that is, it
could be the case that the input history is in C, it has $\langle T, \alpha_1 \rangle$ ($\langle T, \alpha_2 \rangle$) as a prefix, and
no other requests have been submitted at the other site yet. Therefore the scheduler
cannot delay $p_1$ ($p_2$) for the purpose of waiting for some future request submitted at
site 1 (site 2). It has the following two options. First, the process of the scheduler at
site 1 (site 2) can either grant $p_1$ ($p_2$) directly or after receiving a message from the
process at the other site. Second, it can inform the other site of $p_1$ ($p_2$) or it can
withhold that information. These two options, expressed as sets of instructions in our
programming system, give rise to the only four possible cases for site 1 (site 2) to
handle px(p 2 ). These are cases A1-A4 (cases BI-B4 are symmetric). 
Case A1: if (input as seen at site 1 is in PR(C)) then grant $p_1$

In this case the process at site 1 does not wait or inform site 2 of its
decision.

Case A2: if (input as seen at site 1 is in PR(C)) then grant $p_1$
send (message to site 2)

In this case the process at site 1 does not wait but informs site 2 of its
decision. The message can potentially contain all available information at site
1. The order of these instructions can be interchanged.

Case A3: wait (for message from site 2)

if (input as seen at site 1 is in PR(C)) then grant $p_1$

In this case the process at site 1 waits for information from site 2, but
does not send any information. Interchanging the order of these steps will be
treated similarly with case A1.

Case A4: send (message to site 2)

wait (for message from site 2)

if (input as seen at site 1 is in PR(C)) then grant $p_1$

In this case the process at site 1 informs site 2 of its problem, and waits
for an answer before proceeding. Any permutation of these steps also uses two
messages.
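
The four cases can be read as the combinations of two binary choices (wait or not, inform or not). The following Python sketch is only an illustration of that reading; the function name and the bookkeeping of one message per wait and one per send are assumptions made here, not part of the programming system of Section 2.1.

```python
from itertools import product

def site_options():
    """Enumerate the (wait, send) choices open to site 1 when handling p_1."""
    # Labels follow the case names used in the text.
    labels = {(False, False): "A1", (False, True): "A2",
              (True, False): "A3", (True, True): "A4"}
    cases = {}
    for wait, send in product((False, True), repeat=2):
        # Assumed bookkeeping: waiting consumes one received message,
        # informing the other site consumes one sent message.
        cases[labels[(wait, send)]] = {
            "wait": wait,
            "send": send,
            "messages": int(wait) + int(send),
        }
    return cases

cases = site_options()
two_message_cases = [name for name, c in cases.items() if c["messages"] == 2]
```

Under this bookkeeping only case A4 involves two messages, which matches the role it plays in possibility (i) of the argument.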

We will now reach a contradiction by examining two possibilities. 

(i) If either the process at site 1 uses the instructions of case A4 or the process at
site 2 uses the instructions of case B4 then two messages are consumed in executing
either $\langle T,\alpha_1\rangle$ or $\langle T,\alpha_2\rangle$. Since we assume these prefixes belong to $M_C(b)$ and not to
$M_C(b-2)$ and they are prefixes of $\langle T,\alpha\rangle$, we will have to use $(b-1)+2=b+1>b$
messages at least to achieve our performance goals starting from $\langle T,\alpha\rangle$.

(ii) For all other combinations of cases of instructions we will also find 
contradictions. 

Using case Ai instructions for site 1 and case Bj instructions for site 2, for
$i,j \in \{1,2\}$, we obviously have situations where the input prefix is $\langle T,\beta\rangle \notin PR(C)$ and
is (incorrectly) executed without rearranging requests.

In any one of the remaining combinations either site 1 uses instructions of case
A3 or site 2 uses instructions of case B3. We will reach a contradiction using A3
instructions (B3 is symmetric). Let the input history h* be in C and have $\langle T,\alpha_1\rangle$ as
prefix. When the request for $p_1$ is submitted to the process of the scheduler at
site 1, the process will wait for a message from the other site, which will determine its
decision (granting $p_1$ or making it wait for future requests from other transactions).
But when actions of $\langle T,\alpha_1\rangle$ are being executed at site 2 no such message can be sent.
This is because according to site 2 both $\langle T,\alpha_1\rangle$ and $\langle T,\alpha_2\rangle$ are possible (proper)
continuations and decisions cannot be made excluding one or the other. So the
message site 1 is waiting for will be sent when descendants of $\langle T,\alpha_1\rangle$ arrive at site 2.
Thus we force action $p_1$ to wait for some action which is not its ancestor in h*, and
therefore h*, although in C, is not executed on-line as required by Definition 12 of
Section 2.1.



"II $\Rightarrow$ I": Under the conditions of the theorem we will construct a realization of
C achieving the desired performance. That is, we will present a scheduler, which will
consist of two processes (i.e. LOCALSCHED$_i$($\langle T,\alpha\rangle$,b), i=1,2) and recognize on-line
all histories in C with $\langle T,\alpha\rangle$ as prefix, without executing more than b send
instructions in the worst case. The algorithm is written in a programming system with
the capabilities outlined in Section 2.1.

The LOCALSCHED processes (see Fig. 3.2 for i=1) communicate with
transactions and with each other using messages. The messages received by a process
are buffered in a FIFO queue. The variables that the scheduler processes use for
recording the state of the system are the state variables $s_i$, $r_i$, $t_i$, $p_i$ and b. The
variables $m_i$ (modes) are used to synchronize the two processes, so that when one
process asks the other a question it expects an answer before examining other
requests. The execution of send instructions is controlled by the conditions of
Theorem 3. The procedures Grant$_i$ grant requests. Finally the procedures Delay$_i$,
Delay*$_i$ handle the cases where the input is discovered to be incorrect. Let us explain
the above features in some detail.
LOCALSCHED$_1$($\langle T,\alpha\rangle$, b)

1. $s_1 := \langle T,\alpha\rangle$; $r_1 := \langle T,\alpha\rangle$; $t_1 := \emptyset$; $p_1 := \emptyset$; $m_1$ := normal;

2. when queue nonempty do

3.    if $m_1$ = normal
         then M := first message of queue (delete it from queue);
         else wait (for message of type Q or A); M := first such message;

4.    (Based on M assign)
         $s_1$ := (state of 1);
         $r_1$ := (state of 1 that is also known by 2);
         $p_1$ := (set of pending requests, at most one per site);
         $t_1$ := (state of 1 resulting if pending requests were granted);

5.    (Respond to message M) do one of three cases (R, A, Q);

6. od end


case R:
   if $t_1 \notin PR(C)$ then Delay$_1$($p_1$) else
      if $\exists \beta$ s.t. $\{t_1=(\beta/r_1)_1\} \land \{\beta \notin PR(C)\} \land \{(\beta/r_1)_2 \in PR(C)\} \land \{t_1 \in M_C(b-2)\}$
         then $m_1$ := wait; send $\langle 2,Q,p_1,s_1\rangle$;
         else $s_1 := t_1$; Grant$_1$($p_1$,$s_1$);


case A:
   if $p_1$ is in $s_1$ then Grant$_1$($p_1$,$s_1$); LOCALSCHED$_1$($s_1$, b-2); else Delay*$_1$($p_1$,$s_1$);


case Q:
   if $t_1 \in PR(C)$ then $s_1 := t_1$;
   if $m_1$ = normal then send $\langle 2,A,\emptyset,s_1\rangle$;
   if $t_1 \in PR(C)$ then Grant$_1$($p_1$,$s_1$); LOCALSCHED$_1$($s_1$, b-2); else Delay*$_1$($p_1$,$s_1$);


Figure 3.2 LOCALSCHED at 1
(a) Messages : The messages received by the scheduler process at site 1 (for those
received at site 2 interchange 1 and 2) have the following format (there are three
types of messages): <1, type, requested action, state at site 2>.

R (for type = request). This is a message from a transaction to the
scheduler process at site 1. It contains a request for an action p at site 1. State
information about site 2 is included (else it is $\emptyset$), when data from site 2 is
necessary to compute p. This happens when an ancestor of p, in the
transaction of p, has been executed at site 2. Then the transaction defined
message can be used to transmit information about the state at 2. Examples of
such messages are $\langle 1,R,p,s_2\rangle$ or $\langle 1,R,p,\emptyset\rangle$.

Q (for type = question). This is a message from the scheduler process at
site 2. This process needs site 1's approval in order to decide whether to grant
some request p, when it is at state $s_2$. An example of such a message is
$\langle 1,Q,p,s_2\rangle$.

A (for type = answer). This is a message from the scheduler process at site
2 answering a type Q message of the process at site 1. Site 2, having full
knowledge of the system, determined whether the pending request at site 1
should be granted. All necessary information has been incorporated in the
state at 2. An example of such a message is $\langle 1,A,\emptyset,s_2\rangle$.

(b) State : The state of each LOCALSCHED$_i$ ($s_i$) is the prefix in PR(C) that the
process at site i knows has been executed. For example with C=SR the state is G(T)
(see Definition 9 Section 2.1), with a partial assignment of directions that can be
realized by a prefix. For this case correctness is guaranteed if acyclicity is maintained
in the directed part of the conflict graph. In addition to $s_i$, LOCALSCHED$_i$ keeps an
estimate of the state of the other scheduler process, $r_i$. With this estimate it keeps
track of the part of $s_i$ that the other site might not have heard of. Every time a
message is received or a request is granted $s_i$ and $r_i$ are consistently updated. Finally
$p_i$ is used to store pending requests and $t_i$ the state that would result if these requests
were granted. The variable b keeps count of site i's estimate of the number of send
instructions executed or the number of messages of types Q and A.

(c) Synchronization of the scheduler processes : The modes ($m_i$) are binary
variables used by the scheduler processes to guarantee that every question is
answered. A mode is either normal, indicating that new requests are processed, or
wait, indicating that the process at i needs an answer in order to decide on pending
requests and handles no requests until it receives one. As can be seen from Fig. 3.3 a
type A message is never received when the mode is normal. The two sites never
deadlock (wait for each other indefinitely), because of the effect of A and Q type
messages on the mode.
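
As an illustration of the mode discipline described in (c), here is a minimal Python sketch of one process's mode. The class and method names are hypothetical, and the transition "a Q or an A received in wait mode returns the process to normal" is an assumed reading of Fig. 3.2 (where the follow-up LOCALSCHED call resets the mode), not a transcription of Fig. 3.3.

```python
from enum import Enum

class Mode(Enum):
    NORMAL = "normal"
    WAIT = "wait"

class ModeTracker:
    """Tracks the mode of one scheduler process (hypothetical helper)."""

    def __init__(self):
        self.mode = Mode.NORMAL

    def on_send_question(self):
        # Sending a Q message puts the process in wait mode.
        if self.mode is not Mode.NORMAL:
            raise RuntimeError("a question is already outstanding")
        self.mode = Mode.WAIT

    def on_receive(self, msg_type):
        """Return True if the message is handled now, False if it stays queued."""
        if msg_type == "R":
            # Requests are only examined in normal mode.
            return self.mode is Mode.NORMAL
        if msg_type == "A":
            # The text states that an A is never received in normal mode.
            if self.mode is Mode.NORMAL:
                raise RuntimeError("type A message received in normal mode")
            self.mode = Mode.NORMAL
            return True
        if msg_type == "Q":
            # Assumed: a Q arriving in wait mode (crossing questions) also
            # resolves the outstanding question.
            self.mode = Mode.NORMAL
            return True
        raise ValueError(f"unknown message type: {msg_type}")

tracker = ModeTracker()
handled_first = tracker.on_receive("R")     # examined immediately
tracker.on_send_question()                  # now waiting for an answer
handled_while_waiting = tracker.on_receive("R")
handled_answer = tracker.on_receive("A")
```

A request that arrives in wait mode is simply left in the queue, which is the behaviour described under delaying requests.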

(d) Communication Protocol : Every incoming request is examined (if the mode
is normal) and if it renders the local state incorrect it is delayed. If its execution leads
to a correct local state ($t_i$) we determine whether send instructions should be
executed. We first examine whether it is possible for a malicious adversary to give as
input to the other site requests, also leading to a correct local state for the other site,
but such that the total input is incorrect. If this is not possible the request is granted.
If, on the other hand, this is possible some strategy has to be worked out for
communication. In that case we also test whether $t_i \in M_C(b-2)$. If this is not so the
request is granted without informing the other site. If this is so, site i sends a Q
message in order to ask for the other site's permission to proceed. If it receives a
go-ahead then we notice that, after sending two messages, both local processes are in
fact LOCALSCHED$_i$($s_{new}$,$b_{new}$) with common new state and new message
parameters. This makes it possible to give an inductive proof of correctness.

Three decision questions are actually answered:
$\{t_i \in PR(C)\}$?
{does a "bad" $\beta$ exist with $t_i$ = (projection of $\beta$ at i given $r_i$)}?
$\{t_i \in M_C(b-2)\}$?

(e) Granting requests : When LOCALSCHED$_i$ decides to grant a request it
allows the transaction to update the variable of the requested action. Also if this
transaction will send a message to some other site it will incorporate in that message
the local state $s_i$. All this is achieved using Grant$_i$($p_i$,$s_i$) (i.e., if $p_i$ contains a request
for an action at site i, then let the transaction of this action perform its update and
use $s_i$ in any messages it sends to the other site, else no operation).

(f) Delaying requests : If a request is received when the mode is wait the request
remains in the queue and will eventually be processed in its order of arrival. It is
delayed at most by the communication delay of a Q and an A message. If on the
other hand the scheduler discovers that the pending requests (at most one at each
site) would lead to an incorrect execution then it delays one pending request. There
are two cases:

For only one site $t_i \notin PR(C)$. Then the process at i delays the pending
request at i by putting it at the end of its queue (busy waiting). The scheduler
continues functioning as if the input were correct. This is accomplished using
Delay$_i$($p_i$) (i.e., if $p_i$ contains a pending request at i, then put it at end of i's
queue, else no operation).

Both sites discover that $t_i \notin PR(C)$. This happens through an exchange of
a Q and an A message (one pending request at the site that sent the Q
message) or of two Q messages (one pending request at each site). In this case
Delay*$_i$($p_i$,$s_i$) is used. One pending request is delayed. If there are two
pending requests the younger one is delayed and the older one granted. Since
consistent timestamps [19] can always be assigned to events in a distributed
system, there is no problem in determining the younger of the two pending
requests. Both processes of the scheduler know that the input is incorrect and
that a common prefix s* has been executed. In this case no more send
instructions have to be executed to realize C (see Def. 12 Section 2.1), because
a predetermined correct completion of s* can be executed.
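
To make the "younger request is delayed" rule concrete, the following Python sketch orders two pending requests by timestamp. It is a hedged illustration only: the `PendingRequest` record, the integer logical clock, and the tie-break by site number are assumptions introduced here; the text itself only relies on the existence of consistent timestamps [19].

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PendingRequest:
    action: str      # name of the requested action (illustrative)
    site: int        # site at which the request is pending
    timestamp: int   # assumed logical clock value at submission

def split_by_age(a, b):
    """Return (older, younger); equal clock values are ordered by site number."""
    older, younger = sorted((a, b), key=lambda r: (r.timestamp, r.site))
    return older, younger

older, younger = split_by_age(
    PendingRequest("q2", site=2, timestamp=5),
    PendingRequest("q1", site=1, timestamp=7),
)
# 'older' would be granted and 'younger' delayed; since both sites apply the
# same deterministic rule, they reach the same decision without extra messages.
```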



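[Diagram: transitions of the mode at site 1 between normal and wait. One
transition is labeled <1,A,...> or <1,Q,...>; the remaining edge labels are not
recoverable.]

Figure 3.3 The mode at 1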
Let invlen($\langle T,\alpha\rangle$) = {number of actions in T} - {number of actions in $\alpha$}.

For the conditions (I) and (II) of Theorem 3 we have proven that (I) implies (II).
We will prove that (II) implies (I) by induction on invlen.

Induction hypothesis: For invlen($\langle T,\alpha\rangle$)=j we have that, if (II) is true then (I) is
true and moreover after $\langle T,\alpha\rangle$ has been executed LOCALSCHED$_i$($\langle T,\alpha\rangle$,b), i=1,2
realizes C and sends at most b messages.

For j=0 this is trivially true since $\langle T,\alpha\rangle$ is a history in C, there are no more
requests left and $\langle T,\alpha\rangle \in M_C(0) \subseteq M_C(b)$. So we assume the hypothesis is true for $j<j^*$
and (II) is true for $\langle T,\alpha\rangle$ with invlen $j^*$ and some b (that depends on $\langle T,\alpha\rangle$). Since
we have to prove (I), we have to exhibit a realization of C that after $\langle T,\alpha\rangle$ sends b or
fewer messages. We consider the scheduler Q that realizes C by submitting all
requests to one site, except when the input prefix $\langle T,\alpha\rangle$ has been executed. From
that moment on Q uses LOCALSCHED$_i$($\langle T,\alpha\rangle$,b), i=1,2. We need only consider the
operation of the scheduler after $\langle T,\alpha\rangle$. There are two cases:

Case A: $h \in C$. First we will examine the case where no send instructions are
executed and then the case where some are executed.

A.1: No send instructions are executed. Then the output has to be h and
no request p waits for the execution of a request which is not an ancestor of p
in h (Def. 12 is satisfied). This is because on every request p the test (Is new
state in PR(C)?) is always true and involves only local computation. The
reason for this is that by definition of C, as a concurrency control principle, h
has no crossedges that are not forced by the transactions. Thus the part of the
input each scheduler sees is automatically a prefix of h. Therefore it is
unnecessary to wait for a message from the other site to verify that what the
local scheduler sees is indeed a prefix of PR(C). Finally note that $b \ge 0$.

A.2: Two or more send instructions are executed (the first two resulting in
an exchange of a Q and an A message or two Q messages). Up to the first
exchange the previous arguments, of A.1, hold. In order to execute send
instructions a prefix $\beta$ must exist that satisfies the conditions (1),(2) of
Theorem 3 and has the new state $t_i$ of a scheduler process as a projection.
Also $t_i$ must be in $M_C(b-2)$, which can be decided since invlen($t_i$) $< j^*$ (by the
induction hypothesis and the "only if" part of the proof). Finally since
$M_C(b-2)$ is not empty $b \ge 2$. After the exchange LOCALSCHED$_i$($s_{new}$,b-2)
i=1,2 is used and we can invoke the induction hypothesis since
invlen($s_{new}$) $< j^*$. So h is output on-line with at most 2+(b-2) send
instructions after $\langle T,\alpha\rangle$.

Case B: $h \notin C$. First we will prove that the output of the scheduler Q is a history
in C (B.1). Then we will prove that no more than b send instructions are executed (B.2).

B.1: Let the output (the granted requests) be a history h* not in C. Then
it has (perhaps more than one) prefixes, called $\gamma$, such that $\gamma \notin PR(C)$, $\gamma$ has
$\langle T,\alpha\rangle$ as prefix and $\gamma$ is minimal (all its prefixes are in PR(C)). Let us call $q_1$
and $q_2$ the final actions of $\gamma$, not in $\langle T,\alpha\rangle$, which are at sites 1 and 2
respectively. At least one of them must exist. Without loss of generality let site
1 grant $q_1$ before site 2 grants $q_2$ (if $\gamma$ has a $q_2$). Since $\gamma$ is minimal we have
that either $q_2$ does not exist, or $q_1$ is an ancestor of $q_2$ in h*, or $q_1$ and $q_2$ are
concurrent in h* and then $\gamma$ is an example of a $\beta$ prefix of Theorem 3. If $q_2$
does not exist then, when the process at site 1 receives $q_1$ it cannot grant it,
because it sees from the information available to it that the result would be
incorrect. If $q_1$ is an ancestor of $q_2$ in h* (that is, there is a transaction
crossedge making $q_1$ an ancestor of $q_2$ in h*) then site 2 knows $q_1$ has been
executed (through a transaction defined message) and delays $q_2$. Finally if $\gamma$
is an example of a $\beta$ prefix of Theorem 3, then some exchange of two
messages has to take place before $q_2$ and $q_1$ are granted. By (II) one of the
projections of $\gamma$ is in $M_C(b-2)$, $b \ge 2$, and thus, before both requests are granted,
one of the processes sends a Q message. If this exchange results in
LOCALSCHED$_i$($s_{new}$,$b_{new}$) i=1,2 being initiated we can use induction to
argue that $\gamma$ cannot have been executed. If the exchange results in Delay*$_i$
i=1,2 being called, both processes output a correct predetermined completion
of a common state s*. Thus we conclude that $\gamma$ cannot have been executed
and the output of the scheduler is always in C.

B.2: Since $b \ge 0$, if no send instructions are executed we have no problem.
If send instructions are executed, let us look at the first round of
communication (two Q messages or one Q and one A message). If as a result
of this exchange LOCALSCHED$_i$($s_{new}$,b-2) i=1,2 is initiated with $s_{new} \in M_C(b-2)$,
we know that invlen($s_{new}$) $< j^*$ and $b \ge 2$ (see A.2). By induction no
more than b-2 send instructions are used after this and again our goals are
met. If as a result of this exchange Delay*$_i$ i=1,2 is initiated at both sites
(which is possible since the input $h \notin C$), then we know that $b \ge 2$. This is
because (II) holds and a "bad" $\beta$ exists. After both sites call Delay*$_i$ they have
a common state s* and use no more send instructions, because the completion
of s* is predetermined and can be recognized locally. Thus no more than b
send instructions are ever executed.

This completes the proof of Theorem 3. □



Corollary 3.1 : If C, a concurrency control principle, has a computationally
efficient realization, then it has a communication-optimal realization, which can be
implemented in space polynomial in n (n = number of actions of T).

Proof : It follows from Theorem 1 that, since C has a computationally efficient
realization, recognizing if a prefix is in PR(C) can be done in polynomial time in n.
Consider the following realization Q:

Q: (1) Each site computes b* from T, where $b^* = b^*(T) = \min\{b \mid \langle T,\emptyset\rangle \in M_C(b)\}$
(2) Site i uses LOCALSCHED$_i$($\langle T,\emptyset\rangle$,b*) (i=1,2)

By the constructive proof of Theorem 3, Q is a realization of C using the
minimum (b*) number of messages. From this proof we have that four
computational tasks are performed by LOCALSCHED. These are:

(a) Given t, does $t \in PR(C)$?

This can be performed in polynomial time (and therefore space).

(b) Given $t \in PR(C)$, $i \ne j$, and $r \le t$, is there a $\beta$ such that
$\{t=(\beta/r)_i\} \land \{\beta \notin PR(C)\} \land \{(\beta/r)_j \in PR(C)\}$?

This can be performed in nondeterministic polynomial time (and therefore space).

(c) Given $t \in PR(C)$, $b \ge 0$, does $t \in M_C(b-2)$?

Using Theorem 3, the polynomial characterization of PR(C) and the theory of
alternation [5], we have that both this task and step (1) of Q can be implemented in
polynomial space.

(d) Finally if Q discovers that the input is incorrect and Delay* is called at both
sites then a correct completion of s* can be efficiently computed. This can be done
based on a predetermined ordering of the actions and the test of membership in
PR(C).

This completes the proof of the existence of a communication-optimal scheduler
realizing C in polynomial space and exponential time. □



We will end this section with some comments on Theorem 3. 

(1) Message lengths : Let us examine the length in bits of the messages sent. If
|T|=n there are at most n! states and in order to uniquely code a state we need
O(n log n) bits. Also we never send more than 2n messages. In the proof of Theorem 3
we have used an inefficient format for messages $\langle \ldots,s_i\rangle$. Although for clarity of
presentation we used $s_i$ (O(n log n) bits) in our messages, we could have as well used
$s_i \setminus r_i$ (i.e., each site will hear of every action at most once). Thus in total O(n log n) bits
will be used in the worst case.
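
The O(n log n) figure follows from $\log_2 n! \le n \log_2 n$. A small Python sketch (function names are illustrative, not from the text) compares the exact number of bits needed to give each of n! states a distinct code with the bound $n \log_2 n$:

```python
import math

def bits_to_code_states(n):
    """Bits needed to give each of n! states a distinct binary code."""
    # Codes 0 .. n!-1 fit in (n!-1).bit_length() bits.
    return (math.factorial(n) - 1).bit_length()

def n_log2_n(n):
    """The bound n * log2(n) used in the text."""
    return n * math.log2(n)

for n in (4, 10, 100):
    print(n, bits_to_code_states(n), round(n_log2_n(n), 1))
```

Since $n! \le n^n$, the exact count never exceeds the bound by more than one bit of rounding.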

(2) More than two sites : The two site case, while being the simplest distributed 
configuration is sufficient for the results of Chapter 4. If more sites are used and the 
mode of communication is a broadcast mode, Theorem 3 can be easily generalized. 
On the other hand a network of sites makes optimal communication a more difficult 
problem, since it implicitly adds the problem of appropriately routing the messages. 

(3) Persistency: We have examined schedulers that realize C and consist of two
processes, one at each site. Each of these processes knows of some pending requests
and a prefix of a history in C that has been executed (its state).

We call such realizations of C persistent if whenever a process i discovers that
the execution of a pending request $p_i$ would make its state $s_i$ incorrect, it delays $p_i$
indefinitely and proceeds as if only the requests in $s_i$ had been submitted.

If $PR(C) \in P$ there are persistent polynomial time schedulers realizing C, as is
obvious from the proof of Theorem 1 and [25]. On the other hand the scheduler of
Corollary 3.1 is not persistent. For some incorrect inputs Delay* is used. This is
because persistency requires that messages are sent even after the input is discovered
to be incorrect. To illustrate this suppose our scheduler starts with $\langle T,\alpha\rangle \in M_C(b)$ and
receives a "bad" input $\langle T,\beta\rangle$ with projections $\langle T,\alpha_1\rangle \in M_C(b-2)$ and $\langle T,\alpha_2\rangle \notin M_C(b-2)$.
It is possible for $\langle T,\alpha_2\rangle$ to have been executed when the scheduler, at the expense of
two messages, discovers the input to be incorrect. If we want our scheduler to be
persistent, starting from $\langle T,\alpha_2\rangle$ it has to use more than b-2 send instructions.

This difference between on-line computationally efficient and on-line
communication efficient algorithms, which accept the same strings, arises because of
the nature of resources we are trying to optimize. In one case we wish to achieve
performance C at asymptotic computation cost $O(n^k)$, in the other at fixed (say n/15
or 200) communication cost.

From the proof of Theorem 3 it is easy to see that
"$\langle T,\alpha\rangle \in M_C(b)$ iff there is a persistent realization of C, which if the input is in C
sends at most b messages after $\langle T,\alpha\rangle$".



We have related communication complexity of schedulers achieving parallelism
C, with the computational problems $\langle T,\alpha\rangle \in M_C(b)$ (which are in PSPACE).

If the input history is in C and $\langle T,\emptyset\rangle \in M_C(b)$ a user's delay D is bounded by:
$b \cdot (\text{communication delay/message}) \ge D \ge 0$.
If the input history is not in C there is a user who has to wait for other users.

The approach of Theorem 3 and the formulation of the scheduling problem are
pretty much independent of concurrency control and serializability. The application
to databases provides practical motivation and analytical tools (i.e., mixed ordered
multigraphs). In fact the entire methodology can be extended to distributed on-line
computation of combinatorial functions of two integers, which in a distributed
environment are stored at two different sites [38].
3.2 Games related to Distributed On-line Computation

In this section we will define decision problems for the sets of prefixes, which 
were recursively characterized in the previous section. 

Distributed scheduling is related, using $M_C(b)$, to a game on prefixes, PREFIX,
whose rules are displayed in Fig. 3.4. In this game Player I corresponds to a
malicious adversary who wishes to force communication. His move is a "bad"
continuation $\beta$ of the current position $\alpha$. Player II corresponds to the two
cooperating scheduler processes. Each one of his choices i* indicates which of the
two processes has the responsibility of guarding against the "bad" continuation $\beta$ (by
questioning the other process before proceeding). Player I wants to prolong the game
as much as possible, whereas Player II tries to bring it to an end as soon as possible
(other than that there is no winner or loser).

From Theorem 3 we can deduce the following property of communication-optimal
realizations of C:



Corollary 3.2 : The minimum number of messages sent by a communication-optimal
realization of C equals the length of PREFIX($\langle T,\emptyset\rangle$) if both players play
optimally (we call this the minimax length of PREFIX).

Proof : Follows from Theorem 3 and the theory of alternation [5]. Note that
although in general we define PREFIX from an arbitrary initial position $\langle T,\alpha_0\rangle$, we
are in fact interested in $\alpha_0=\emptyset$. T represents the static (a-priori) information on
transaction schemata, that is used to optimize communication. Thus $\{\langle T,\alpha\rangle \notin M_C(b)?\}$
is equivalent to {Is the minimax length of PREFIX($\langle T,\alpha\rangle$) greater than b?}. □
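
The minimax length can be pictured with a small recursive sketch over an abstract game. This is a hedged illustration in Python, not the construction used in the proof: the function `minimax_length`, the callback `bad_moves` (returning, for a position, the pairs of projections of each "bad" continuation available to Player I), and the convention that a round of one move by each player counts as two are all assumptions made here. The recursion terminates only if the game is finite, as it is for PREFIX, where each projection properly extends the current position.

```python
from functools import lru_cache
from typing import Callable, Hashable, Iterable, Tuple

Position = Hashable
Move = Tuple[Position, Position]  # projections (alpha_1, alpha_2) of a bad continuation

def minimax_length(start: Position,
                   bad_moves: Callable[[Position], Iterable[Move]]) -> int:
    """Length of the game when Player I maximizes and Player II minimizes."""

    @lru_cache(maxsize=None)
    def length(pos: Position) -> int:
        best = 0  # no bad continuation: the game ends here
        for a1, a2 in bad_moves(pos):
            # Player II answers by keeping the projection with the shorter game.
            best = max(best, 2 + min(length(a1), length(a2)))
        return best

    return length(start)

# A toy game given as an explicit move table (purely illustrative positions).
toy = {0: [(1, 2), (1, 1)], 1: [(3, 3)], 2: [], 3: []}
length_from_start = minimax_length(0, lambda pos: toy[pos])
```

In the toy table Player I prefers the second move from position 0, because the first one lets Player II escape to a position with no further moves.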



In the following section we will analyze the game PREFIX for C=SR. If we 
choose serializability (SR) as our concurrency control principle the board position 
becomes the conflict graph G(7) with some of its edges directed. The moves of 
Player I become choices of directions to undirected edges of G(7). Much insight into 
PREFIX in this case is gained by studying a game played on mixed graphs called 
CONFLICT and displayed in Fig. 3.5. This game is our departure point in the 
PSPACE-Complcteness proof, given in the next section. 
PREFIX($\langle T,\alpha_0\rangle$)

Initial position: For fixed C, a prefix $\langle T,\alpha_0\rangle$

Position before Player I's move: A prefix $\langle T,\alpha\rangle$

Player I's move: Select a prefix $\langle T,\beta\rangle$ such that

(1) $\beta$ is a continuation of $\alpha$, with projections $\alpha_i=(\beta/\alpha)_i$, i=1,2
(2) $\alpha_1$, $\alpha_2$ are prefixes of C
(3) $\beta$ is not a prefix of C

Player II's move: Select $i^* \in \{1,2\}$ and set $\alpha := \alpha_{i^*}$.



Players I and II take turns moving. Player II always moves when I does.
Player I's goal is to prolong the game as much as possible.
Player II's goal is to end the game as soon as possible.



Figure 3.4
The game PREFIX
CONFLICT($G_0$)

Initial position: A mixed graph $G_0=(V_0,E_0,A_0)$

($E_0$ partitioned into "red" and "green")

Position before Player I's move: A mixed graph G=(V,E,A)

Player I's move: Select an assignment of directions ($A_X$) to an $X \subseteq E$ such that

(1) R (H) is the "red" ("green") subset of X

(2) $A_R \cup A$, $A_H \cup A$ have no directed cycles

(3) $A_X \cup A$ has a directed cycle

Player II's move: Select $Y \in \{R,H\}$ and set $E := E \setminus Y$ and $A := A \cup A_Y$



Players I and II take turns moving. Player II always moves when I does.
Player I's goal is to prolong the game as much as possible.
Player II's goal is to end the game as soon as possible.



Figure 3.5
The game CONFLICT
CONFLICT+($G_0$)

Initial position: An ordered mixed multigraph $G_0=(V_0,E_0,A_0,\{\ge_i\})$

($E_0$ partitioned into "red" and "green")

Position before Player I's move: An ordered mixed multigraph $G=(V,E,A,\{\ge_i\})$

Player I's move: Select a closed assignment ($A_X$) to an $X \subseteq E$ such that

(1) $A_X$ has projections $A_X^r$, $A_X^g$

(2) $A_X^r \cup A$, $A_X^g \cup A$ have no directed cycles

(3) $A_X \cup A$ has a directed cycle

Player II's move: Select $y \in \{r,g\}$ and set $E := E \setminus$ (edges in $A_X^y$) and $A := A \cup A_X^y$



Players I and II take turns moving. Player II always moves when I does.
Player I's goal is to prolong the game as much as possible.
Player II's goal is to end the game as soon as possible.



Figure 3.6
The game CONFLICT+
The game CONFLICT abstracts, in the legal moves of Player I, only the rules of
PREFIX derived from an unordered conflict graph ($\beta$ has to create a cycle in the
conflict graph, while $(\beta/\alpha)_i$, i=1,2 should not). In fact the assignments of directions to
edges of G(T) in PREFIX should also correspond to prefixes $\beta$ and $(\beta/\alpha)_i$, i=1,2 (see
Lemma 1, Section 2.2). CONFLICT can obviously be played on multigraphs with no
modifications of its rules.
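
Conditions (2) and (3) of Player I's move in CONFLICT amount to three directed-cycle tests. The Python sketch below is one possible way to check them, assuming arcs are given as (tail, head) pairs and that the caller has already separated the directed red edges ($A_R$) from the directed green ones ($A_H$); the function names and this representation are illustrative choices, not taken from the text.

```python
from collections import defaultdict, deque

def has_directed_cycle(arcs):
    """Return True iff the directed multigraph given by (tail, head) pairs has a cycle."""
    arcs = list(arcs)
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))
    # Repeatedly remove nodes with no incoming arcs; leftovers lie on or behind a cycle.
    ready = deque(n for n in nodes if indeg[n] == 0)
    removed = 0
    while ready:
        u = ready.popleft()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return removed < len(nodes)

def is_legal_move(A, A_R, A_H):
    """Check conditions (2) and (3) for Player I's move, with A_X = A_R + A_H."""
    A, A_R, A_H = list(A), list(A_R), list(A_H)
    parts_acyclic = (not has_directed_cycle(A + A_R)
                     and not has_directed_cycle(A + A_H))
    whole_cyclic = has_directed_cycle(A + A_R + A_H)
    return parts_acyclic and whole_cyclic

# Two transactions with one red and one green conflict, directed in opposite ways.
legal = is_legal_move(A=[], A_R=[("t1", "t2")], A_H=[("t2", "t1")])
```

Each colour class alone is acyclic here, but together they close a cycle, which is exactly the situation neither site can detect locally.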

We will now generalize the game CONFLICT to CONFLICT+ (see Fig. 3.6),
where in addition to the rules of CONFLICT a precedence rule is observed.

The input to the new game CONFLICT+(G) is an ordered mixed multigraph
$G=(V,E,A,\{\ge_i\})$. (V) is the vertex set, (E) is the multiset of undirected edges
partitioned into "red" and "green", (A) is the multiset of directed edges and $\{\ge_i\}$ are
partial orders (e.g. all undirected edges incident at node i form a partial order $\ge_i$).
All conflict graphs (see Def. 9, Section 2.1) are such constructs. If $A \ne \emptyset$ some
conflicts have been resolved and the $\ge_i$'s correspond to transaction partial orders.



Definition 15 : Given an ordered mixed multigraph $G=(V,E,A,\{\ge_i\})$, and an
assignment ($A_X$) of directions to a multiset of edges $X \subseteq E$, we call this assignment
closed (with respect to G) when:

If $ij \in X$ and is directed from i to j and $ik \ge_i ij$ then $ik \in X$. □
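
A direct check of this closure condition can be sketched in Python as follows. The representation is an assumption made for illustration: edges carry identifiers, the assignment maps each edge of X to its (tail, head) direction, and the partial order at each node is supplied as an explicit, already transitively closed set of (higher, lower) edge pairs. The order is read as the one attached to the tail node i.

```python
def is_closed(assignment, order):
    """
    assignment: dict edge_id -> (tail, head), the directions given to edges of X.
    order: dict node -> set of (higher, lower) edge-id pairs, meaning
           higher >=_node lower (assumed transitively closed).
    """
    for edge, (tail, _head) in assignment.items():
        for higher, lower in order.get(tail, ()):
            # Every edge that precedes a directed edge at its tail must also be in X.
            if lower == edge and higher not in assignment:
                return False
    return True

# At node "i", edge "ik" precedes edge "ij" in the order.
order = {"i": {("ik", "ij")}}
open_assignment = {"ij": ("i", "j")}
closed_assignment = {"ij": ("i", "j"), "ik": ("i", "k")}
```

Here `open_assignment` directs ij out of i without covering ik, so it fails the check, while `closed_assignment` passes.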



Given a conflict graph G(T) and an assignment of directions to some of its edges
($A_X$), that has no directed cycles, then $A_X$ is realizable by a prefix in SR iff it is
closed. This follows easily from Theorem 2 and Lemma 1 (see Section 2.2).

Let the undirected edges of G be partitioned into "red" and "green", and let $A_X$
be a closed assignment of directions to $X \subseteq E$. It is easy to see that the following
closed assignments are uniquely determined. They are called the projections of $A_X$.

$A_X^i$ (where i = "red" or "green"):
(1) $A_X^i \subseteq A_X$

(2) $A_X^i$ is closed

(3) all i edges of X are given directions in $A_X^i$
If $\{\ge_i\}$ become the empty partial orders for every node, CONFLICT+ becomes
CONFLICT (i.e., $X=R \cup H$, $A_X^r=A_R$, $A_X^g=A_H$). The real interest of
CONFLICT+ is its relation to PREFIX. A prefix $\langle T,\alpha\rangle$ in PR(C) determines a
unique mixed ordered multigraph $G_\alpha(T)$ (see Def. 10, Section 2.1). In the next
section (Lemma 2, Section 4.1) we will show that for C=SR, PREFIX($\langle T,\alpha\rangle$) and
CONFLICT+($G_\alpha(T)$) are equivalent. An example of CONFLICT, where an optimal
game leads to four moves, is presented in Fig. 3.7.
We will close this section with a brief discussion of an important special case of
the question $\{\langle T,\alpha\rangle \notin M_C(b)?\}$, namely b=0. This problem is obviously in NP,
because all we have to do is guess a prefix satisfying conditions (1),(2) of Theorem 3
and check these conditions in polynomial time.

In the next section (Corollary 4.2, Section 4.1) we will prove that
$\{\langle T,\alpha\rangle \notin M_C(0)?\}$ is NP-Complete. This leaves us with the problem $\{\langle T,\emptyset\rangle \notin M_C(0)?\}$.
We say that the conflict graph G(T) of a transaction system T contains a mixed cycle,
if it contains a cycle with edges $e_1$ and $e_2$ where $e_1$ corresponds to a conflict at site 1
("red") and $e_2$ to a conflict at site 2 ("green").



Corollary 3.3 : For C=SR, if G(T) contains no mixed cycle then $\langle T,\emptyset\rangle \in M_C(0)$.
This is also a necessary condition, whenever the transactions in T have no
crossedges.

Proof : The sufficiency is obvious from the characterization of SR and conditions
(1),(2) of Theorem 3. The necessity for transactions with special structure is easy for
two transaction systems. For more transactions we can use a straightforward
induction on the number of transactions (nodes of G(T)). □



For general transaction systems T and C=SR, the complexity of determining if
$\{\langle T,\emptyset\rangle \in M_C(0)?\}$ is an interesting open question. For example all systems in Fig. 3.8
are in $M_C(0)$, yet their conflict graphs contain mixed cycles.
Figure 3.8 G(T)'s for $\langle T,\emptyset\rangle \in M_C(0)$.
4. The Complexity of PREFIX

This chapter contains our main result, which is an analysis of the game PREFIX 
for C=SR. 

4.1 PREFIX is PSPACE-Complete 

We will now prove the following theorem: 



Theorem 4: Let C=SR. For input T and $b \ge 0$, determining whether the
minimax length of PREFIX($\langle T,\emptyset\rangle$) is greater than b is PSPACE-Complete.



All the games we will examine in this section are in PSPACE. This follows
easily from the analysis in Chapter 3. Therefore we will present only the reduction of
a well known PSPACE-Complete problem to PREFIX. This is the problem QBF
(i.e., what is the truth value of a quantified boolean formula) [11,33,34].



QBF: 

Input : A quantified boolean formula I_n of the form:

∃x_1 ∀x_2 ∃x_3 ... ∃x_{n-1} ∀x_n F(x_1,x_2,...,x_n)

where F is a boolean formula without quantifiers in 3CNF (3-conjunctive
normal form) of the variables x_1,...,x_n (n even).

Question : Is I_n true?



QBF can be viewed as a game between two players, the ∃-player and the ∀-player.
These players take turns assigning values to the variables in the order these
variables are quantified in I_n (i.e., from left to right). First the ∃-player assigns a
value to x_1, then the ∀-player assigns a value to x_2, etc. The ∃-player wins if the
values assigned to the x_i's, i=1,...,n, make F(x_1,x_2,...,x_n) true; otherwise he loses. The
∃-player has a winning strategy iff I_n is true. This PSPACE-Complete problem is
used in most reductions to games [5,11,33,34,8,29].
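
The game view of QBF can be made concrete with a small recursive evaluator. This is an illustrative sketch, not part of the original text: the function name exists_player_wins and the clause encoding (integer k for x_k, -k for its negation) are assumptions, and clause width is not restricted to three literals. It explores the whole game tree, so it takes exponential time.

```python
def exists_player_wins(n, clauses, assignment=()):
    """Decide whether the exists-player has a winning strategy.

    Variables x_1..x_n are assigned left to right; odd-indexed variables
    belong to the exists-player and even-indexed ones to the forall-player.
    clauses: list of clauses, each a list of nonzero ints (k means x_k,
    -k means the negation of x_k).
    """
    i = len(assignment) + 1              # index of the next variable to assign
    if i > n:
        # All variables assigned: the exists-player wins iff F is true.
        return all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
                   for clause in clauses)
    outcomes = (exists_player_wins(n, clauses, assignment + (v,))
                for v in (False, True))
    # Exists-player needs one good value; forall-player must fail on both.
    return any(outcomes) if i % 2 == 1 else all(outcomes)


# ∃x_1 ∀x_2 (x_1 ∨ x_2) ∧ (x_1 ∨ ¬x_2): choosing x_1 = 1 wins.
print(exists_player_wins(2, [[1, 2], [1, -2]]))    # True
# ∃x_1 ∀x_2 (x_1 ∨ ¬x_2) ∧ (¬x_1 ∨ x_2): the forall-player picks x_2 ≠ x_1.
print(exists_player_wins(2, [[1, -2], [-1, 2]]))   # False
```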

Another game on boolean formulas used in our proof is AE-QBF. This is similar
to QBF, except that the ∀-player makes all his moves before the ∃-player.



AE-QBF : 

Input : A quantified boolean formula I_n of the form:

∀x_2 ∀x_4 ... ∀x_n ∃x_1 ∃x_3 ... ∃x_{n-1} F(x_1,x_2,...,x_n)

where F is a boolean formula without quantifiers in 3CNF (3-conjunctive
normal form) of the variables x_1,...,x_n (n even).
Question : Is I_n true?



AE-QBF is Π_2^p-Complete, where Π_2^p is a class of the polynomial time hierarchy
[33,11] corresponding to one ∀∃ alternation (see Fig. 4.1).
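
The single ∀∃ alternation of AE-QBF can be sketched as two nested brute-force loops. The code below is an illustration under assumptions: the name ae_qbf_true and the integer clause encoding (k for x_k, -k for ¬x_k) are hypothetical, and the search is exponential in n.

```python
from itertools import product

def ae_qbf_true(n, clauses):
    """Evaluate ∀x_2 ∀x_4 ... ∀x_n ∃x_1 ∃x_3 ... ∃x_{n-1} F(x_1,...,x_n), n even."""
    evens = list(range(2, n + 1, 2))     # universally quantified variables
    odds = list(range(1, n, 2))          # existentially quantified variables

    def satisfied(value):
        return all(any(value[abs(l)] == (l > 0) for l in clause)
                   for clause in clauses)

    for even_vals in product((False, True), repeat=len(evens)):
        value = dict(zip(evens, even_vals))
        # The exists-player answers after seeing all universal choices.
        if not any(satisfied({**value, **dict(zip(odds, odd_vals))})
                   for odd_vals in product((False, True), repeat=len(odds))):
            return False
    return True


# ∀x_2 ∀x_4 ∃x_1 ∃x_3 (x_1 ∨ x_2 ∨ x_3) ∧ (x_2 ∨ x_4 ∨ ¬x_3): take x_1=1, x_3=0.
print(ae_qbf_true(4, [[1, 2, 3], [2, 4, -3]]))   # True
# ∀x_2 ∃x_1 (x_1 ∨ ¬x_2) ∧ (¬x_1 ∨ x_2): the exists-player copies x_2.
print(ae_qbf_true(2, [[1, -2], [-1, 2]]))        # True
```

The second formula is true here only because the ∃-player moves last; with the quantifier order ∃x_1 ∀x_2 the same matrix is false.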



Figure 4.1 The polynomial time hierarchy
P^Y = {L: there is a language L' ∈ Y s.t. L is P-time Turing reducible to L'}
NP^Y = {L: there is a language L' ∈ Y s.t. L is NP-time Turing reducible to L'}

Our reduction of QBF to PREFIX will proceed in four parts, which we outline
below from (I) to (IV).

(I) We show that CONFLICT, as defined in Fig. 3.5, is Π_2^p-hard. We
accomplish this by reducing AE-QBF to CONFLICT in Lemma 1. The input graphs
to CONFLICT are mixed (i.e., they may contain directed edges).

(II) We generalize CONFLICT to the game CONFLICT+, which has, in
addition to the rules of CONFLICT, a partial order on the edges incident at a node. The
definition is such that all possible conflict graphs G(T) can be inputs to
CONFLICT+. In Lemma 2 we prove that the game PREFIX (for C=SR) is a
special case of CONFLICT+.

(III) We prove that CONFLICT+ is PSPACE-Complete, even when the input
is a graph without directed edges. We accomplish this in Lemma 3 using many of the
constructs of Lemma 1.

(IV) Finally we prove that PREFIX(<T,∅>) is PSPACE-Complete by showing
that the graphs in Lemma 3 are in fact conflict graphs for some transaction system.

In Lemma 1 we will examine the game CONFLICT (see Fig. 3.5). Its input is
a mixed graph G=(V,E,A), where E is partitioned into "red" and "green". Player I
picks an assignment of directions for a "red" subset of E (A_R) and for a "green"
subset of E (A_H). The choices he makes must be legal (i.e., A∪A_R and A∪A_H have no
directed cycles, while A∪A_R∪A_H has a directed cycle). Player II chooses "red" ("green"),
making the new directed board position A∪A_R (A∪A_H) from A. Player I wants to
make the game last and Player II wants to terminate it.

The direction of an undirected edge e can become fixed during the game in two
ways. First, if Player I chooses e as part of A_R (A_H) and Player II chooses
"red" ("green"), then e becomes part of A, the directed section of the board
position. Second, even if e has not become part of the directed section (A) before
Player I makes his new move, it is possible for A to contain a directed path between
the endpoints of e. Now e's direction is fixed, because it can only be used in one
direction if Player I's moves are to be legal. It is easy to see that if a move by Player
I is legal, A_R (A_H) must contain edges whose directions have not been fixed. Because
of this observation the following fact is easily seen to be true.

(0) If G has z "green" edges, CONFLICT(G) lasts at most 2z moves. If Player I
makes a move with two "green" edges whose directions have not been fixed, a move
of "green" by Player II would consume two "green" edges. Moreover, if Player I
makes a move with exactly one "green" edge (e) whose direction has not been fixed,
then no matter what the response of Player II is, e's direction becomes fixed
(i.e., either e becomes part of the new A or a path is included in the new A
connecting the endpoints of e).
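
The rules of CONFLICT, as described in the text above, can be explored on very small boards with an exhaustive search. The sketch below is an illustration only: it follows the prose description (Fig. 3.5 itself is not reproduced here), the names has_cycle, orientations and minimax_length are hypothetical, and the running time is exponential in the number of undirected edges.

```python
from itertools import product

def has_cycle(arcs):
    """Return True if the directed graph given by a set of (u, v) arcs has a cycle."""
    succ = {}
    for u, v in arcs:
        succ.setdefault(u, []).append(v)
    state = {}                       # node -> "open" (on the DFS stack) or "done"

    def visit(u):
        state[u] = "open"
        for v in succ.get(u, ()):
            if state.get(v) == "open" or (v not in state and visit(v)):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in list(succ))

def orientations(edges):
    """Yield every way to pick a subset of the undirected edges and direct it."""
    for choice in product((None, 0, 1), repeat=len(edges)):
        yield frozenset((u, v) if c == 0 else (v, u)
                        for (u, v), c in zip(edges, choice) if c is not None)

def minimax_length(red, green, A=frozenset()):
    """Moves played (both players counted) when Player I maximizes and
    Player II minimizes; red/green list the still-undirected edges."""
    best = 0
    for AR in orientations(red):
        if has_cycle(A | AR):
            continue
        for AH in orientations(green):
            if has_cycle(A | AH) or not has_cycle(A | AR | AH):
                continue
            # Legal move: A∪A_R and A∪A_H acyclic, A∪A_R∪A_H cyclic.
            used_r = {frozenset(a) for a in AR}
            used_g = {frozenset(a) for a in AH}
            after_red = minimax_length(
                [e for e in red if frozenset(e) not in used_r], green, A | AR)
            after_green = minimax_length(
                red, [e for e in green if frozenset(e) not in used_g], A | AH)
            best = max(best, 2 + min(after_red, after_green))
    return best


# Triangle with two "red" edges and one "green" edge: z = 1, so at most 2 moves.
print(minimax_length([("a", "b"), ("b", "c")], [("c", "a")]))   # 2
```

A legal move always has nonempty A_R and A_H (otherwise A∪A_R∪A_H would coincide with one of the acyclic sets), so the recursion terminates. On the triangle the result matches the bound 2z of observation (0).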

We will use the notation MN for an undirected edge and (MN) for a directed
edge from M to N. Similarly M_1M_2...M_k will be an undirected and (M_1M_2...M_k) a
directed path from M_1 to M_k.

Lemma 1 : Given a mixed graph G and a nonnegative integer b, determining
whether the minimax length of CONFLICT(G) is greater than b is Π_2^p-hard.

Proof : For an arbitrary instance I_n of AE-QBF we construct the mixed graph
G(I_n) using the rules (a) to (d) below. We will prove that I_n is true iff the game
CONFLICT can last more than n moves on G(I_n).

(a) For every existentially quantified variable x_i, i=1,3,...,n-1, in I_n a copy of the
graph in Fig. 4.2(c) is included as a subgraph of G(I_n). This subgraph contains 6
directed edges and 2 "red" undirected edges, namely T_iD_i (labelled with 1) and F_iE_i
(labelled with 0). Actually this is the graph of Fig. 4.2(b) without nodes A_i,B_i,M_i,N_i.
These are the ∃-subgraphs.

(b) For every universally quantified variable x_i, i=2,4,...,n, in I_n a copy of the
graph in Fig. 4.2(a) is included as a subgraph of G(I_n). This subgraph contains 6
directed edges, 1 "red" undirected edge T_iD_i (labelled with 1) and 1 "green"
undirected edge F_iE_i (labelled with 0). These are the ∀-subgraphs.

(c) For every clause of the 3CNF formula of I_n (i.e., F(x_1,x_2,...,x_n)) a copy of the
graph in Fig. 4.3 is included as a subgraph of G(I_n). This subgraph contains 35
directed edges and 21 "red" undirected labelled edges. For the kth clause (u∨v∨w)
(counting from left to right in F(x_1,x_2,...,x_n)), which has literals u,v,w, we have seven
possible paths from C_k to C_{k+1}. Each one of these paths corresponds to an
assignment of values to the literals u,v,w of the clause which makes the clause true
(i.e., only assignment 000 is excluded). The assignment can be read from the labels of
"red" edges on the path. Each of the three columns, of seven labels each,
corresponds to the possible values of one literal. Also, for one literal (say u) four
directed edges go to F_u and three to T_u, depending on the label of the "red" edge
from which the directed edge starts. We call these directed edges (to F_u or T_u)
backedges. We use the following rule:

u = x_i   ⇒  F_u = F_i and T_u = T_i

u = ¬x_i  ⇒  F_u = T_i and T_u = F_i    for x_i a variable of I_n

The backedges are connected so that, if the labels correspond to values of
variables and literals, a backedge connects two undirected edges iff their labels are
inconsistent (e.g., if x_i=1 and u=¬x_i, a backedge connects T_iD_i and the "red" edges with
label 1 in the column of u). These are the clause-subgraphs.
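
The counts quoted for the clause-subgraph in (c) can be re-derived by enumerating the satisfying labellings of a three-literal clause. The sketch below is an illustration; it assumes our reading of the backedge rule (a "red" edge labelled 1 for literal u is inconsistent with u=0, so its backedge goes to F_u, and an edge labelled 0 goes to T_u), and the helper name backedge_target is hypothetical.

```python
from itertools import product

# Each path from C_k to C_{k+1} carries one label per literal (u, v, w);
# only the all-zero labelling, which falsifies the clause, is excluded.
paths = [labels for labels in product((0, 1), repeat=3) if any(labels)]

# One "red" labelled edge per (path, literal column) pair.
red_edges = [(p, column) for p in range(len(paths)) for column in range(3)]

def backedge_target(label):
    # Assumed reading: label 1 conflicts with the edge labelled 0 (at F_u),
    # label 0 conflicts with the edge labelled 1 (at T_u).
    return "F_u" if label == 1 else "T_u"

first_column = [backedge_target(labels[0]) for labels in paths]

print(len(paths))                   # 7
print(len(red_edges))               # 21
print(first_column.count("F_u"))    # 4
print(first_column.count("T_u"))    # 3
```

The output agrees with the seven paths, the 21 "red" labelled edges, and the four backedges to F_u and three to T_u stated in (c).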

(d) The graph G(I_n) is constructed by identifying nodes with the same name.
That is, S_p's of ∃-subgraphs with S_q's of ∀-subgraphs if p=q. Also F_p's or T_p's of ∃-
and ∀-subgraphs are identified with F_q's or T_q's of clause-subgraphs if p=q. We
also identify C_1=S_{n+1}. If there are m clauses in I_n we add the "green" edge
S_1C_{m+1}.

An example is provided in Fig. 4.4 for the AE-QBF:
I_n = ∀x_2 ∀x_4 ∃x_1 ∃x_3 (x_1∨x_2∨x_3)∧(x_2∨x_4∨¬x_3), if we delete the nodes A_i,B_i,M_i,N_i,
i=1,3 and A_{1,2},A_{1,3},A_{3,4}. We will first examine some simple properties of G(I_n).

(1) Let G(I_n) contain z "green" edges. Then CONFLICT(G(I_n)) can last 2z-2
moves and at most 2z moves. Here z=n/2+1. The game can last at most 2z moves because
of observation (0) (right before Lemma 1). It can always last 2z-2 moves, because
Player I can play z-1 times on the z-1=n/2 mixed cycles (F_iE_iT_iD_iF_i), i=2,4,...,n. His
moves are legal no matter what the response of Player II is.

(2) Let (S_1...C_{m+1}) be any directed path from S_1 to C_{m+1}, not using the
"green" edge S_1C_{m+1} and respecting the directed edges in G(I_n). We note that each
pair F_iE_i, T_iD_i, i=1,2,...,n, forms a cutset separating S_1 and C_{m+1}. Thus (S_1...C_{m+1})
contains F_iE_i or T_iD_i for all i=1,2,...,n.

(3) All paths (S_1...C_{m+1}) have to contain node C_1. If they contain a backedge it
is easy to see that they have to pass through C_1 at least two times. Therefore simple
paths (containing a node only once) (S_1...C_{m+1}) do not contain backedges.

Let us proceed with the proof of equivalence:

"only if" If I_n is true then Player I first makes n/2 moves on the ∀-subgraphs
using the mixed cycles (F_iE_iT_iD_iF_i), i=2,4,...,n. The n/2 moves of Player II fix
directions for all the undirected edges F_iE_i, T_iD_i, i=2,4,...,n. His choice of "red" turns
T_iD_i into (T_iD_i) and fixes the direction of F_iE_i to (E_iF_i) (because of the directed
path (E_iT_iD_iF_i), which now becomes part of A). This corresponds to assigning x_i the
value 1. Similarly his choice of "green" turns F_iE_i into (F_iE_i) and fixes the direction
of T_iD_i to (D_iT_i). This corresponds to assigning x_i the value 0.

At this point in CONFLICT 2z-2 moves have been made and we can say that
the choices of Player II have assigned values x*_i to the variables x_i, i=2,4,...,n. Since
I_n is true there exist values x*_i of the variables x_i, i=1,3,...,n-1, which make
F(x*_1,x*_2,...,x*_n) true. This assignment of values {x*} to variables {x} implies an
assignment of values to the literals of every clause {u(x*)} (e.g., u=¬x, x*=1
implies u*=0).

Let us describe the n/2+1st move of Player I. Consider the simple path
(S_1...C_{m+1})*, which consists of the following subpaths in the various subgraphs of
G(I_n):

(S_kT_kD_kS_{k+1}) if x*_k=1, k=1,2,3,...,n. In ∀-subgraphs the direction of T_kD_k has
been fixed to (T_kD_k). In ∃-subgraphs (T_kD_k) is used.

(S_kF_kE_kS_{k+1}) if x*_k=0, k=1,2,3,...,n. In ∀-subgraphs the direction of F_kE_k has
been fixed to (F_kE_k). In ∃-subgraphs (F_kE_k) is used.

In the kth clause-subgraph the path from C_k to C_{k+1}, whose labels are the
values assigned to the literals of the clause by {x*}. Such a path exists since no
clause is assigned the values 000 by {x*}.

We note that, because of the way (S_1...C_{m+1})* traverses ∀-, ∃- and
clause-subgraphs, the directed edges of G(I_n) and (S_1...C_{m+1})* form no directed cycle. Note
that no backedge has both its endpoints on (S_1...C_{m+1})*, because the labels in the
various subgraphs along (S_1...C_{m+1})* are consistent.

Using the rules of Fig. 3.5 Player I picks:

A "green" set H={S_1C_{m+1}} and directs it (A_H) from C_{m+1} to S_1.

A "red" set R={"red" edges in (S_1...C_{m+1})*} and directs them (A_R) along the
path (S_1...C_{m+1})*.

This is a legal move since A_R∪A and A_H∪A are acyclic, while A_R∪A_H∪A is not.
Therefore if I_n is true CONFLICT can last n+2 moves.

"if" If I_n is false we will prove that CONFLICT(G(I_n)) cannot last n+2 moves.
We will assume CONFLICT(G(I_n)) can last n+2 moves and reach a contradiction.

The move of Player I which has "green" edge S_1C_{m+1} ∈ H must be his n/2+1st
move. This is because, if the direction of some "green" edge has not been fixed yet,
any simple path (S_1...C_{m+1}) that Player I chooses would make it possible to fix the
directions for two "green" edges. This follows from property (2) of such paths,
proven above, and the structure of the ∀-subgraphs. Thus Player I must make n/2
moves involving the "green" edges in the ∀-subgraphs first. Moreover any choice of
Player II will fix their direction (by observation (0)). We will prove that there is a
sequence of choices by Player II that will not let Player I move another time.
Since I_n is false, ¬I_n is true, or

∃x_2 ∃x_4 ... ∃x_n ∀x_1 ∀x_3 ... ∀x_{n-1} ¬F(x_1,x_2,...,x_n)

Let the values of the x_i's, i=2,4,...,n, making this formula true be x*_i. For the first
n/2 moves of Player I, each one necessarily involving a single F_iE_i whose direction
has not been fixed, the response of Player II should be:

If x*_i=0 then "green". This fixes the directions of T_iD_i and F_iE_i into (D_iT_i) and
(F_iE_i) respectively.

If x*_i=1 then "red". This fixes the direction of F_iE_i into (E_iF_i).

The n/2+1st move of Player I is now constrained in several ways in order to be
legal. First, for the "green" set we know S_1C_{m+1} ∈ H, because it is the only "green"
edge whose direction has not been fixed. Second, for the "red" set we know that
{undirected "red" edges of a path (S_1...C_{m+1})} ⊆ R. Finally (S_1...C_{m+1}) and the
directed part of G(I_n) must not contain a cycle. This path (S_1...C_{m+1}) must be simple
(no backedges by property (3)) and thus pass through all the subgraphs:

In a clause-subgraph it has to use one of the seven paths.

In a ∀-subgraph its behavior is constrained by the way the directions of edges
T_iD_i, F_iE_i (of which it contains exactly one) have been fixed.

In an ∃-subgraph it is constrained to contain exactly one of T_iD_i, F_iE_i. Else
(S_1...C_{m+1}) and the directed part of G(I_n) would contain a cycle. We extend the
assignment {x*} in the following way for i=1,3,...,n-1:

If (S_1...C_{m+1}) contains T_iD_i then x*_i=1 else x*_i=0.

Thus every candidate path (S_1...C_{m+1}) actually corresponds to an assignment
{x*} of values to the variables and {u*} to the literals of F. This assignment can be
read from the labels of edges along the path. In fact {x*} and {u*} are inconsistent.

By the way x*_i, i=2,4,...,n, were chosen every candidate assignment makes
F(x*_1,...,x*_n) false. Thus a consistent assignment {u(x*)} to the literals must make the
literals in some clause (say the kth clause) 000. Our candidate path (S_1...C_{m+1}) uses
a subpath (C_k...C_{k+1}) in that clause, which has a label 1 for one of its literals.
Because of the initial connection of backedges, the backedge of that literal ends at a
node that belongs to the path (S_1...C_{m+1}) in a ∀- or ∃-subgraph. Therefore A_R∪A
cannot be chosen to be acyclic and no candidate n/2+1st move of Player I can be
legal. This is the desired contradiction. □

Figure 4.3 The kth clause subgraph

Figure 4.4 An example

We will now examine the game CONFLICT+(G), which has as input an
ordered mixed multigraph G=(V,E,A,{≥_i}). The edge multiset E is partitioned into
"red" and "green". The undirected edges incident at node i belong to the partial
order ≥_i. The game is like CONFLICT((V,E,A)); the only difference is that the
assignments A_x (corresponding to A_R∪A_H), A_x^r (containing all selected "red" edges
and corresponding to A_R) and A_x^g (containing all selected "green" edges and
corresponding to A_H) must be closed. That is:

if (ij) ∈ A_x and ik ≥_i ij then (ik) or (ki) ∈ A_x (unless of course ik already is in A). All
this is described exactly in Definition 15 and Fig. 3.6 of Section 3.2.
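
The closure condition can be checked mechanically. The following sketch is an illustration under assumptions: it implements the condition literally as stated above (Definition 15 is not reproduced here), it represents each order ≥_i by numbers attached to the edges at node i with a larger number read as "greater", it handles simple graphs only, and the name is_closed is hypothetical. The example uses a triangle PQR whose edges carry the relative numbers 1 and 2 at each node, in the pattern of the triangle gadget used later in Lemma 3.

```python
def is_closed(A_x, A, rank):
    """Check the closure condition for a set A_x of directed edges.

    A_x, A: sets of directed edges (i, j), meaning "from i to j".
    rank[i][k]: number assigned at node i to the edge ik; a larger number
    is read here as "greater" in the order at i (an assumption).
    """
    directed = A_x | A
    for (i, j) in A_x:
        for k, r in rank[i].items():
            if k != j and r >= rank[i][j]:
                # Edge ik dominates ij at i, so ik must already be directed.
                if (i, k) not in directed and (k, i) not in directed:
                    return False
    return True


rank = {"P": {"Q": 1, "R": 2},
        "Q": {"P": 1, "R": 2},
        "R": {"P": 1, "Q": 2}}

print(is_closed({("Q", "P"), ("R", "Q")}, set(), rank))   # True
print(is_closed({("Q", "P")}, set(), rank))               # False
print(is_closed({("Q", "P")}, {("R", "Q")}, rank))        # True
```

In the second call (QP) is selected without QR, which is greater than QP at Q, so the assignment is not closed; in the third call QR is already directed in A, which the definition permits.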

As indicated in the previous section, CONFLICT (see Fig. 3.5) is a special case
of the game CONFLICT+ (see Fig. 3.6), which is important because of its relation
to PREFIX (see Fig. 3.4). The inputs of CONFLICT+ are slightly more general
constructs (i.e., ordered mixed multigraphs) instead of mixed graphs. They are
motivated by conflict graphs and realizable assignments of directions to their
edges.

From Definition 10, Section 2.1 and Lemma 1, Section 2.2, we have that a prefix
<T,α> uniquely determines an ordered mixed multigraph. This is because, given
<T,α>, we can construct G_α(T)=(V,E,A,{≥_i}), which is the conflict graph (G(T))
with some conflicts resolved (A), some conflicts unresolved (E), and the transaction
orderings on the unresolved conflicts. The assignment of directions A is closed (with
respect to the conflict graph G(T)) and moreover if C=SR it has no directed cycles
(see Theorem 2).

Lemma 2 : Given a prefix <T,α> in PR(SR) and a nonnegative integer b,
the minimax length of PREFIX(<T,α>) equals the minimax length of
CONFLICT+(G_α(T)).

Proof : Let us recall the following facts:

(a) An assignment of directions (A_Z) to undirected edges (Z) of the
conflict graph G(T) is realizable by a prefix iff:

(i) A_Z is closed (with respect to G(T))

Oi) A z has no directed cycle (i^y s.L: i 2 i 2 ;> a i^,...,^ > a i^. 

(b) Consider two realizable assignments A, A' of directions to edges of a
conflict graph G(T) and let <T,α> be a prefix realizing A. It is easy to see that
if A ⊆ A' then A'\A is closed (with respect to G_α(T)).

(c) Also recall that continuations <T,β> of <T,α> in PREFIX, with
projections β_i, i=1,2, have properties:

<T,α> realizes A, A has no cycles

<T,β> ∉ PR(SR), <T,β> realizes A', A' has a cycle

<T,β_i> ∈ PR(SR), <T,β_i> realizes A_i, A_i has no cycle, i=1,2

We have that A'\A, A_1\A, A_2\A are closed (with respect to G_α(T)).

Moreover if 1 is the "red" site and 2 the "green" site and A_x=A'\A then we
have A_x^r=A_1\A, A_x^g=A_2\A.

To prove the lemma we use induction on j, where j=|actions in T and not in α|.
For j=0 and any b the lemma is true, since no move is possible (all conflicts are
resolved). We will assume the lemma is true for all b and all j, 0≤j≤j*-1, and prove it
true for j*. For every move in one game we will exhibit a move in the other, leading
to assignments realizable only by strictly larger prefixes.

"only if" From the discussion above, a move in PREFIX corresponds to a move
in CONFLICT+, and no matter what the choice of Player II is, the resulting
assignment of directions to the conflict graph G(T) is strictly larger than A and
realizable.

"if" A move in CONFLICT+ produces assignments A_x, A_x^r, A_x^g. Since these
are closed (with respect to G_α(T)) and the existing directed part of the board A is
closed (with respect to G(T)), we have that A_x∪A, A_x^r∪A, A_x^g∪A are closed (with
respect to G(T)).

We will show that A_x∪A is realizable by a <T,β>, which is a continuation of
<T,α>. Using Lemma 1, Section 2.2, all that remains to be proven is that A_x∪A has
no directed cycles of form (ii) above. It is easily seen that such a cycle would be
completely contained (because of the closure property) in either A_x^r∪A or A_x^g∪A.
But since A_x^r∪A, A_x^g∪A must be acyclic, such a cycle cannot exist. Thus A_x∪A is
realizable; in fact, using the construction of Lemma 1, Section 2.2, we can choose
<T,β> to be a continuation of <T,α>. Then it is easy to verify that A_x^r∪A, A_x^g∪A
are the assignments determined by the projections of <T,β> (which are strictly larger
than A).

Thus when CONFLICT+ has a move PREFIX has one also. □



We will now prove that CONFLICT+(G) is PSPACE-Complete, even if the
directed part of G is empty.

Lemma 3 : Given an ordered graph G=(V,E,∅,{≥_i}) and a nonnegative integer
b, determining whether the minimax length of CONFLICT+(G) is greater than b is
PSPACE-Complete.

Proof : For an arbitrary instance I_n of QBF we can construct the mixed graph
G'(I_n) using the following subgraphs.

(a) For x_i, i=1,3,...,n-1, ∃-subgraphs of Fig. 4.2(b). These are similar to those
employed in Lemma 1, with additional nodes A_i,B_i,M_i,N_i and their edges.

(b) For x_i, i=2,4,...,n, ∀-subgraphs as in (b) of Lemma 1.

(c) For every clause in F(x_1,...,x_n) clause-subgraphs as in (c) of Lemma 1.

(d) The connections are as in (d) of Lemma 1, with the added edges:
directed (A_iA_{i,i+2}), (A_{i,i+2}B_{i+2}), i=1,3,...,n-3
directed (A_iA_{i,i+1}), (A_{i,i+1}F_{i+1}), i=1,3,...,n-1
undirected "red" A_iB_{i+2}, i=1,3,...,n-3, and A_iF_{i+1}, i=1,3,...,n-1.

An example is exhibited in Fig. 4.4. Using G'(I_n) we can construct the following
ordered graph G(I_n)=(V,E,∅,{≥_i}). Assume I_n has n variables and m clauses:

V: The vertex set of G'(I_n) with an additional vertex for every directed edge in
G'(I_n), which has K_n=10n+35m-2 directed edges. |V|=18n+64m-2.

E: These are the undirected edges of G'(I_n), partitioned into "red" and "green"
as in G'(I_n); moreover we replace every directed edge (RQ) of G'(I_n) (see Fig. 4.5(a))
with a triangle (see Fig. 4.5(b)). Thus G(I_n) has no directed edges. It is a graph
partitioned into "red" and "green" and has 23n+n/2+91m-5 "red" and 11n+35m-1
"green" edges.

{≥_i}: To every edge incident at a node i we assign a number. We use the rules
of Fig. 4.6 and Fig. 4.5(c). The ordering ≥_i is the strict (no two different elements are
equal) total ordering imposed by these numbers at i.
For the kth triangle PQR, 1≤k≤K_n, which replaces a directed edge (RQ), we assign:

at P: PQ ← 1+kK_n, PR ← 2+kK_n

at Q: QP ← 1+kK_n, QR ← 2+kK_n

at R: RP ← 1+kK_n, RQ ← 2+kK_n
For the undirected edges of G'(I_n) we use the numbers 1,2,3 as in Fig. 4.6. Note:

at A_i: A_iB_i ≥ A_iF_{i+1} ≥ A_iB_{i+2}, i=1,3,...,n-1 (the last for i≠n-1)

at F_{i+1}: F_{i+1}A_i ≥ F_{i+1}E_{i+1}, i=1,3,...,n-1

at B_{i+2}: B_{i+2}A_i ≥ B_{i+2}A_{i+2}, i=1,3,...,n-3.
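
The edge counts of the construction can be cross-checked against the move count used in the proof that follows, where the number of "green" edges is z=11n+35m-1 and Player I's last move is numbered z=n+K_n+1. The short sketch below is only bookkeeping; the function names are hypothetical.

```python
def green_edges(n, m):
    # z, the number of "green" edges of G(I_n), as stated in the proof.
    return 11 * n + 35 * m - 1

def triangles(n, m):
    # K_n recovered from z = n + K_n + 1: one "green" edge per triangle,
    # one per variable subgraph, plus the edge S_1 C_{m+1}.
    return green_edges(n, m) - n - 1

for n, m in [(2, 1), (4, 2), (6, 5)]:
    z, K = green_edges(n, m), triangles(n, m)
    assert K == 10 * n + 35 * m - 2      # number of directed edges of G'(I_n)
    print(n, m, K, z)
```

For example, n=4 and m=2 give K_n=108 triangles and z=113 "green" edges, so a game of full length would last 2z=226 moves.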

We will prove that I_n is true iff CONFLICT+(G(I_n)) can last more than 2z-2
moves, where z=11n+35m-1 (the number of "green" edges).

"only if" Assume that I_n is true. We will describe a strategy that will enable
Player I to make z moves (and the game to last 2z moves).

First Player I plays on all the triangles that we substituted for directed edges of
G'(I_n). At his kth move he plays in the K_n-k+1st triangle, 1≤k≤K_n (PQR in Fig.
4.5). The first move is:
A_x={(QP),(PR),(RQ)}
A_x^r={(PR),(RQ)}
A_x^g={(QP),(RQ)}

These are closed assignments (Def. 15, Section 3.2) with respect to the position of
the board. Moreover if (A) are the directed edges on the board before the kth move,
A_x∪A has a cycle, while A_x^r∪A, A_x^g∪A do not. No matter what Player II's choices
will be, RQ becomes the directed (RQ) in the new A. By induction Player I can play
similarly on all triangles. Note that when Player I has played in all K_n triangles
PQR, all (RQ)'s are in the directed part of the board and the directions of the other
edges of the triangles have been fixed. Thus without loss of generality we can assume
all directions on the triangles as being in A and exclude them from our further
arguments about closed assignments.

Now Player I will make n moves alternating between ∃- and ∀-subgraphs (which
correspond to the variables x_i, i=1,...,n, of I_n), from the subgraph of x_1 to the
subgraph of x_n. Recall that QBF I_n can be viewed as the instance of a game between
two players (the ∃-player and the ∀-player), where the ∃-player has a winning
strategy. Player I will pattern his strategy on the winning strategy of the ∃-player of
the QBF game (for moves i+K_n, 1≤i≤n).

The i+K_n th move of Player I (1≤i≤n) is:

(a) If i=1,3,...,n-1 and the ∃-player makes x_i=x*_i=1 (based on the values x*_j
that have been assigned for 1≤j<i) then:
A_x={(T_iD_i),(D_iM_i),(B_iA_i), and (A_{i-2}B_i) if i>1}
A_x^r={(T_iD_i),(D_iM_i), and (A_{i-2}B_i) if i>1}
A_x^g={(B_iA_i), and (A_{i-2}B_i) if i>1}
It is easy to check that if the board position has directed edges A, these assignments
are closed. Also A_x∪A has cycle (T_iD_iM_iB_iA_iT_i) and A_x^r∪A, A_x^g∪A do not have
any cycle (A_{i-2}B_i's direction had been fixed to (A_{i-2}B_i) anyway). No matter what the
response of Player II is to this move, the path (S_iT_iD_iS_{i+1}) and the new directed part
of the board form no directed cycles.
If the ∃-player makes x_i=0 we use the symmetric cycle (F_iE_iN_iB_iA_iF_i).

(b) If i=2,4,...,n then Player I uses cycle (T_iD_iF_iE_iT_i).
A_x={(T_iD_i),(F_iE_i),(A_{i-1}F_i)}
A_x^r={(T_iD_i),(A_{i-1}F_i)}
A_x^g={(F_iE_i),(A_{i-1}F_i)}

Again it is easy to see that the move is legal. But now Player II's response is
significant. A choice "red" would correspond to the ∀-player assigning x_i=x*_i=1
and would fix directions to (T_iD_i) and (E_iF_i). Then only (S_iT_iD_iS_{i+1}) forms no
cycles with the new directed part of the board. A choice "green" would be symmetric
(i.e., x_i=x*_i=0 and only (S_iF_iE_iS_{i+1}) forms no cycles with the new directed part of
the board).

We have now reached the zth (z=n+K_n+1) move of Player I, and the ∃-player
has won his QBF game on I_n using assignment {x*}. Thus the derived assignment
{u(x*)} to the literals makes every clause of the formula of I_n true. We can use the
same move as the last move in Lemma 1 and trivially check that it is legal.

"if" If I_n is false we will prove that, although 2z-2 moves are possible, 2z moves
are not, in CONFLICT+(G(I_n)). In this case ¬I_n is true and the ∀-player has a
winning strategy in the QBF game. We will pattern the strategy of Player II on this
strategy of the ∀-player.

Suppose that CONFLICT+(G(I_n)) can last 2z moves. It is easy to see that
every move of Player I must contain exactly one "green" edge whose direction has
not been fixed by previous moves (observation (0) before Lemma 1). So we can view
sequences of z legal moves by Player I as permutations of the z "green" edges and
name every move by the "green" edge whose direction it fixes.

(a) First let us look at legal PQ-moves, that is, moves whose "green" unfixed
edge belongs to a triangle. If this move (A_x) produces a cycle as in Fig. 4.7(a) we can
infer the following: the edge (RQ) must belong to A_x^r∪A and A_x^g∪A. This is
because A_x^r∪A must contain a directed path (P...Q) and QR ≥_Q QP. (Recall that
QP is the only "green" edge without a fixed direction in A_x.) Thus no matter what
the response of Player II is to such a PQ-move, the edge (RQ) becomes part of A. On
the other hand a PQ-move producing a cycle (A_x) as in Fig. 4.7(b) is never legal.
This is because A_x^g∪A must contain {(PQ),(QR),(RP)}, a cycle. The existence of a
path (Q...P) in A_x^r∪A and the fact that RQ ≥_R PR ≥_P QP force this situation. Thus
PQ-moves fix the direction of QR to (RQ). Finally if Player I were ever to use a QR
in the direction (QR) in some other e-move (e a "green" unfixed edge), then a
response of "red" by Player II would consume two "green" edges (i.e., e and PQ).
Therefore Player I should regard edges RQ as directed (RQ), in order to be able to play
z times.

(b) Let us examine the A_iB_i-moves, i=1,3,...,n-1, and F_iE_i-moves, i=2,4,...,n.
Since the directed edges of G'(I_n) have to be respected, we can only have (B_iA_i) ∈ A_x
and (F_iE_i) ∈ A_x for legal assignments in these moves. This is because A_x∪A must
contain a cycle and all other edges incident at A_i (respectively F_i) have fixed
outgoing (respectively ingoing) directions. Now we can justify the construction in
Fig. 4.6(d) and 4.8. If (B_iA_i) ∈ A_x, from the ≥_{B_i} order we have that (A_{i-2}B_i) ∈ A_x∪A
(e.g., the direction of A_{i-2}B_i is fixed to (A_{i-2}B_i) because of the directed path
(A_{i-2}A_{i-2,i}B_i) in G'(I_n)). From the ≥_{A_{i-2}} order we have that (B_{i-2}A_{i-2}) or (A_{i-2}B_{i-2}) ∈
A_x∪A. Similarly if (F_iE_i) ∈ A_x then (B_{i-1}A_{i-1}) or (A_{i-1}B_{i-1}) ∈ A_x∪A. We have
established that the A_iB_i-move must precede the A_{i+2}B_{i+2}- and F_{i+1}E_{i+1}-moves,
i=1,3,...,n-1.

(c) Finally let us examine the C_{m+1}S_1-move. For this move we need a simple
path (S_1...C_{m+1}) that respects directed edges in G'(I_n), can contain no backedges of
G'(I_n) (similarly to (3) of Lemma 1), and has to pass through S_n and S_{n+1} (the last
∀-subgraph). If the F_nE_n-move has not been played yet, the use of either (T_nD_n) or
(F_nE_n) by the (S_1...C_{m+1}) path would fix the direction of F_nE_n. Thus the C_{m+1}S_1-move
has to follow all the A_iB_i- and F_iE_i-moves, i=1,2,...,n.

We will now show that Player II can force Player I into a game which simulates a
QBF(I_n) game, where Player I is the ∃-player, Player II is the ∀-player and moreover
has a winning strategy. Player I chooses the values of x_i, i=1,3,...,n-1, and II the values
of x_i, i=2,4,...,n. Player I determines when Player II makes his choices (as long as x_i
precedes x_{i+1}, i=1,3,...,n-1). Thus the best Player I can do is assign a value to x_1,
force II to assign a value to x_2, assign a value to x_3, etc. Let us describe how these
assignments take place.

(1) The A_iB_i-move assigns a value to x_i, i=1,3,...,n-1. The only possible choices
for A_x are cycles (B_iA_iT_iD_iM_iB_i) for x*_i=1 or cycles (B_iA_iF_iE_iN_iB_i) for x*_i=0. This
is because directed edges in G'(I_n) must be respected, and for x*_i=1 we have the
following (x*_i=0 is symmetric):

(B_iA_iB_{i+2}A_{i+2}...) would use up B_{i+2}A_{i+2}.

(B_iA_iT_iD_iF_iE_i...) would introduce a cycle in A_x^r∪A.

(B_iA_iT_iD_iS_{i+1}...) would fix the direction of F_{i+1}E_{i+1}.

The strategy of Player II will be to always play "red", fixing the directions of T_iD_i,
F_iE_i and making vertex A_i inaccessible from S_i.

(2) The F_iE_i-move assigns a value to x_i, i=2,4,...,n. The arguments are exactly as
in the ∀-subgraphs of Lemma 1. Player II's choice fixes the direction of F_iE_i, thereby
making x*_i 1 or 0 and allowing a unique path from S_i to S_{i+1} as in Lemma 1. Player
II assigns values to the x*_i's according to the winning strategy of the ∀-player (recall
that I_n is false and thus the ∀-player has a winning strategy).

As a result of all this analysis we see that when it is time for the C_{m+1}S_1-move,
Player II has forced F(x*_1,...,x*_n) to be false, and constrained (S_1...C_{m+1}) to a
unique path through the ∃- and ∀-subgraphs (e.g., the labels on the path are {x*}
exactly as in Lemma 1).

Thus the arguments of Lemma 1 apply to show that the C_{m+1}S_1-move cannot be
legal and CONFLICT+(G(I_n)) cannot last 2z moves. □

We have now practically completed the proof of Theorem 4.

Proof of Theorem 4 : In Lemma 1 we have proven that CONFLICT(G) is
Π_2^p-hard. Using this lemma we have shown, in Lemma 3, that CONFLICT+(G) is
PSPACE-Complete for G an ordered graph (no directed edges). In Lemma 2 we
have shown the equivalence of PREFIX(<T,∅>) and CONFLICT+(G(T)). In order
to complete Theorem 4 all we have to do is argue that the ordered graph in the
reduction of Lemma 3 is a conflict graph for some T.

In fact G(I_n)=(V,E,∅,{≥_i})=G(T) because:
V: every vertex i corresponds to a transaction T_i.

E: every edge e=ij corresponds to transactions T_i and T_j updating a uniquely
defined variable x_e. If e is "red" x_e is stored at site 1; if e is "green" x_e is stored at
site 2.

{≥_i}: All orders are strict total orders, because every edge ij is assigned a different
number at i; thus all vertices are realizable by transactions.

Thus we have shown PREFIX(<T,∅>) to be PSPACE-Complete. □
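
The correspondence between ordered graphs and transaction systems used in this proof can be sketched as follows. This is an illustration under assumptions, not the construction of Section 2: the function name, the variable naming x_e, the input encoding, and the reading of increasing numbers at a vertex as the order of that transaction's actions are all hypothetical choices made for the example.

```python
def transactions_from_ordered_graph(edges, rank):
    """Build one transaction per vertex from a red/green ordered graph.

    edges: list of (i, j, color) with color "red" or "green".
    rank[(v, e)]: number assigned at vertex v to the edge with index e.
    Returns {vertex: [(variable, site), ...]} sorted by increasing number;
    treating increasing numbers as action order is an assumption.
    """
    site = {"red": 1, "green": 2}
    actions = {}
    for e, (i, j, color) in enumerate(edges):
        # Edge e stands for a variable x_e updated by both endpoints.
        for v in (i, j):
            actions.setdefault(v, []).append((rank[(v, e)], f"x_{e}", site[color]))
    return {v: [(var, s) for _, var, s in sorted(acts)]
            for v, acts in actions.items()}


edges = [("T1", "T2", "red"), ("T2", "T3", "green"), ("T1", "T3", "red")]
rank = {("T1", 0): 1, ("T2", 0): 1, ("T2", 1): 2,
        ("T3", 1): 1, ("T1", 2): 2, ("T3", 2): 2}
system = transactions_from_ordered_graph(edges, rank)
print(system["T2"])   # [('x_0', 1), ('x_1', 2)]
```

In this encoding the number of actions of a transaction equals the degree of its vertex, which is the quantity bounded in Corollary 4.1.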



The question whether PREFIX(<T,α>) can last more than b moves has several
interesting subcases.

For <T,α>:

(1) G_α(T) is a graph and the {≥_i} are strict (e.g., every transaction updates a variable
only once. Two transactions never share more than one variable. Three transactions
do not share a variable).

(2) α=∅ (e.g., there are no directed edges or all conflicts are unresolved)

(3) The transactions in T contain no cross-edges (e.g., each ≥_i consists of two total
orders, one "red" and one "green". The "red" and "green" edges are incomparable.
This actually means that there are no transaction defined messages).

(4) The {≥_i} are of fixed size (e.g., no more than L actions per transaction).

For b: whether it is arbitrary or 0.

These cases with their complexities are exhibited in Table 1.
conditions (<T,a>)      Complexity                         Complexity (b=0)

(1)&(2)                 PSPACE-Complete (Theorem 4)        in NP
(1)&(2)&(4) L=6         PSPACE-Complete (Corollary 4.1)    in NP
(1)&(2)&(4) L=4         Π_2^p-hard (Corollary 4.1)         in NP
(1)&(3)&(4) L=6         PSPACE-Complete (Corollary 4.3)    NP-Complete (Corollary 4.2)
(2)&(3)                 in PSPACE                          in P (Corollary 3.3)

Table 1: Is the minimax length of PREFIX(<T,a>) greater than b?
Corollary 4.1 : Whether the minimax length of PREFIX(<T,∅>) is greater than b
is PSPACE-Complete, even if the degree of the graph G(T) is less than or equal to 6. It is
Π_2^p-hard even if the degree of G(T) is less than or equal to 4.

Proof : We will slightly modify the gadgets of Lemma 3 without changing the
validity of its arguments. We replace clause-subgraphs with Fig. 4.9a, v-subgraphs
with Fig. 4.9b and 3-subgraphs with Fig. 4.9c. Let us for the moment ignore the
nodes A_i, B_i, i=1,3,...,n-1. The construction gives us (by Lemma 1) that our decision
problem is Π_2^p-hard. Moreover the only configurations at nodes are those of
Fig. 4.10, thus our transactions need never have more than 4 actions. If on the other
hand we add in nodes A_i and B_i and connect A_i, B_{i+2}, F_{i+1} using the subgraph of
Fig. 4.11 then the arguments of Lemma 3 are still valid. The only difference from
Lemma 3 is that the A_i,B_i-moves must precede the moves in the triangles corresponding
to (A_iB_{i+2}) and (A_iF_{i+1}). We can thus show that our decision problem is
PSPACE-Complete even if transactions are restricted to 6 actions.

Therefore [<T,∅> ∈ M_c(b)?] is PSPACE-Complete even if transaction systems
are very restricted. □

Consider the following combinatorial problem, which is in NP. 



PATH(G,s,t) 

Input : A mixed graph G=(V,E,A) (V=set of vertices, E=set of undirected
edges, A=set of directed edges) and two distinguished nodes s and t.
Output : Is there an assignment (A_E) of directions to the edges in E, such that
the digraph (V, A_E ∪ A) is acyclic, and contains a directed path from s to t?



Note that, if A is acyclic, there is always an A_E* such that (V, A_E* ∪ A) is acyclic.
Also it is easy to determine in the mixed graph G=(V,E,A) if t is reachable from s.
But both conditions simultaneously are hard to decide.
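
To make the statement of PATH concrete, the following is a minimal brute-force sketch in Python. It is not part of the original analysis: the function names and the encoding of graphs as lists of vertex pairs are our own assumptions, and the search is exponential in |E|, so it only illustrates the definition on tiny instances.

```python
from itertools import product


def is_acyclic(vertices, arcs):
    """Kahn's algorithm: True iff the digraph (vertices, arcs) has no directed cycle."""
    indeg = {v: 0 for v in vertices}
    out = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
        indeg[v] += 1
    ready = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return removed == len(vertices)


def reachable(arcs, s, t):
    """True iff there is a directed path from s to t using the given arcs."""
    succ = {}
    for u, v in arcs:
        succ.setdefault(u, []).append(v)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in succ.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def path_exists(vertices, undirected, directed, s, t):
    """Brute-force PATH(G,s,t): try every orientation A_E of the undirected edges."""
    for flips in product((False, True), repeat=len(undirected)):
        oriented = [(v, u) if flip else (u, v)
                    for (u, v), flip in zip(undirected, flips)]
        arcs = oriented + list(directed)
        if is_acyclic(vertices, arcs) and reachable(arcs, s, t):
            return True
    return False
```

For instance, with vertices s, a, t, undirected edges sa and at, and the single directed edge (t,s), the set A is acyclic and t is reachable from s in the mixed graph, yet `path_exists` answers no: the only orientation that yields a path from s to t closes a directed cycle.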
Corollary 4.2 : PATH(G,s,t) is NP-Complete, even if G has at most 2 undirected
edges incident at a node and at most 1 directed edge incident at a node.

Proof : Consider Lemma 1, where all edges F_iE_i become "red". Then the first
player chooses the values for all variables, and our QBF game becomes the
satisfiability problem. Finding a proper path (S_1...C_{m+1}) would answer
PATH(G(I_n),S_1,C_{m+1}). This and the refinements of Corollary 4.1 prove the
Corollary. □



Corollary 4.3 : Whether the minimax length of PREFIX(<T,a>) is greater than b
is PSPACE-Complete (for b arbitrary) and NP-Complete (for b=0). This is true
even if the transactions in T have no cross-edges and a fixed number of actions.

Proof : Another way of stating Corollary 4.2 is that the decision question
[<T,a> ∈ M_c(0)?] is co-NP-Complete. The analysis that follows (for b arbitrary) also
applies to this case, therefore determining if PREFIX(<T,a>) can last more than
0 moves is NP-Complete.

In Lemma 3 we totally ordered all edges incident at a node, by assigning
numbers to them. Thus the transaction system realizing the {≥_i} of G(I_n) had to
have cross-edges. In fact cross-edges are the only way we know of forcing the
creation of desired directed edges.

Given an instance of QBF(I_n) we can construct the ordered mixed graph G''(I_n)
as follows (recall the mixed graph G'(I_n) and the ordered graph G(I_n) of Lemma 3):

G''(I_n)=(V,E,A,{≥_i}) where:
(V,E,A)=G'(I_n), with one exception. The edges A_iB_{i+2} (i=1,3,...,n-3) and A_iF_{i+1}
(i=1,3,...,n-1) are "green" and not "red".
{≥_i} are those implied by the orders of G(I_n) of Lemma 3.

We can prove that I_n is true iff CONFLICT+(G''(I_n)) can last more than n
moves. The argument that is needed to prove equivalence is identical with that of
Lemma 3. Note that G''(I_n) has 2n "green" edges of which n-1 have fixed directions.

It is easy to see that a prefix <T,a> can be constructed, from a transaction system
without cross-edges, such that G_a(T)=G''(I_n). (G_a(T) is G(T) with the resolved
conflicts). Thus by Lemma 2, we complete the proof of Corollary 4.3. By using the
gadgets of Fig. 4.9 we can restrict the transaction systems to sets of transactions with
at most 6 actions (e.g., the nodes A_i have two "green", two "red" and two directed
outgoing edges. "Green" and "red" edges at the same node are incomparable).

This proves that the decision question [<T,a> ∈ M_c(b)?] is PSPACE-Complete
even for T's without cross-edges and with a fixed number of actions per
transaction. □



From this analysis of special cases we see that two sets of constraints give us
equal power:
{(1)&(2)&(4) L=6} and {(1)&(3)&(4) L=6}

Let us now examine the final special case, namely b=0. Since we fix b we 
cannot use the equivalence above. From Corollary 4.2 we have that if a≠∅ and if T
has no cross-edges the problem is NP-Complete. From Corollary 3.3 if T has no
cross-edges and a=∅ the problem is in P.

We have left open two interesting problems: 

(a) Given T without cross-edges and b≥0, is the minimax length of
PREFIX(<T,∅>) greater than b moves? We conjecture this problem is
PSPACE-Complete.

(b) Given T, can PREFIX(<T,∅>) last more than 0 moves? This problem is in
NP and we conjecture it is also in P.
4.2 The Efficiency of Communication-Optimal Schedulers



In the previous section we have analysed the complexity of various cases of
PREFIX, or equivalently examined various cases of the decision problem
[<T,a> ∈ M_c(b)?].

In Section 3.1 we described a programming system in which we can express all
distributed schedulers. These schedulers consist of two processes, one at each site
(Q1,Q2), and realize SR (Definition 12, Section 3.1). That is, an input history h ∈ H can
lead to many possible computation paths. By executing the instructions on such a
path the scheduler outputs a history in C. For each path the output is in C; moreover,
if h ∈ C and the delays of all messages are 0 the output must be h. We call the
scheduler polynomial time bounded if the number of instructions the processes
execute is bounded by a polynomial in n (for all possible paths). The size of the
input is measured by n, which is the number of actions in T.

Corollary 4.4 : Unless NP=PSPACE, there is no communication-optimal
scheduler, which realizes SR and is polynomial time bounded. This is true even if
each transaction is restricted to be a sequence of six updates.

Proof : Suppose such a scheduler Q existed. We know that [<T,∅> ∈ M_c(b)?] is
PSPACE-Complete (even for restricted transaction systems, Corollary 4.1). We will
prove that there exists a nondeterministic polynomial time bounded decision
procedure for this problem. This would imply that NP=PSPACE, an unlikely fact.

Given T and b≥0 we do the following:

(1) guess a history h ∈ SR (this can be easily checked)

(2) simulate the operation of Q on this history

(3) whenever a message is sent we guess its delay and in general guess a
computation path of Q

(4) keep count (with m) of the number of messages sent

(5) if m>b then say yes else say no

If [<T,∅> ∈ M_c(b)?] is true there will exist an input h and a computation path of
Q, where more than b instructions are executed. We can guess the input and the
computation path with a polynomial number of guesses; this is because the size of h
is O(n) and all paths are polynomially bounded. If m>b that means that all schedulers
have to use more than b messages for inputs from the transaction system T. This is
obviously a nondeterministic polynomial time bounded algorithm for our problem. □



Similar results would hold even if we augmented our programming system with
the power to consult oracles in the polynomial hierarchy [11] (i.e., the hierarchy
would collapse beyond a certain level).

Let us note two open problems. 

(a) If we assume P=PSPACE it follows that we can construct efficient
schedulers (in both measures). The consequences of NP=PSPACE on the other
hand are unclear.

(b) If the decision problem [<T,∅> ∈ M_c(b)?] is only NP-hard the arguments of
Corollary 4.4 no longer apply.

Our results indicate that a communication-optimal scheduler must be
computationally inefficient. It is still possible to analyze the information in T and design
various efficient communication-suboptimal realizations of SR. We will end this
section by defining a simple open edge deletion problem. This problem can be used
as an upper bound on the minimum number of messages in order to realize SR.
Because of its simplicity it is also of independent combinatorial interest.

DMC(G)

Input : An undirected graph G, with edges partitioned into "red" and "green".
Output : Find the minimum number of edges, whose deletion produces a graph
with no cycles containing both "red" and "green" edges.
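
The statement of DMC can be illustrated by an exhaustive search. The Python sketch below is only meant to pin down the definition on very small graphs; it says nothing about the complexity of the problem, which is left open here. The function names and the encoding of an edge as a triple (u, v, color) are our own assumptions.

```python
from itertools import combinations


def has_two_color_cycle(edges):
    """True iff some simple cycle uses both a "red" and a "green" edge.

    edges is a list of (u, v, color) triples describing an undirected graph.
    The search enumerates simple cycles by depth-first search, so it is
    exponential in general.
    """
    adj = {}
    for idx, (u, v, _) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))

    def dfs(start, node, visited, colors, last_edge):
        for nxt, idx in adj[node]:
            if idx == last_edge:
                continue  # do not walk straight back over the same edge
            seen_colors = colors | {edges[idx][2]}
            if nxt == start:
                if len(seen_colors) == 2:
                    return True  # closed a simple cycle with both colors
                continue
            if nxt in visited:
                continue
            if dfs(start, nxt, visited | {nxt}, seen_colors, idx):
                return True
        return False

    return any(dfs(s, s, {s}, frozenset(), None) for s in adj)


def dmc(edges):
    """Smallest number of edge deletions that leaves no two-color cycle."""
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            rest = [e for i, e in enumerate(edges) if i not in removed]
            if not has_two_color_cycle(rest):
                return k
    return len(edges)
```

A triangle with two "red" edges and one "green" edge needs one deletion, while two such triangles sharing a single vertex need two, because no cycle passes through both triangles.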






5. The Combinatorics of Locking 

The most common technique used for the resolution of conflicts in concurrency 
control is locking. In this chapter we will extend the elegant analysis of locking 
described in [39] from the centralized to the distributed case. In the process, the 
geometric criterion of [39] will be replaced by a simple combinatorial condition 
(i.e., the strong connectivity of a directed graph). 



5.1 Distributed Locking 

Let us first present a simple extension of the definitions for locking, which 
appear in [39]. We will utilize the notions of Distributed Database Design (DDD), 
transaction, action, history and serializability from Section 2.1, with the following 
additions: 



Definition 16 : For the DDD=<G, Data, Stored-at, IC>, the Data is partitioned
into variables (Var) and locking variables (LVar). The function lock-of: Var→LVar
determines for every variable x its lock X (i.e., X is the lock-of(x)). The constraint
∧_(X∈LVar) X=0 is part of the integrity constraints IC. □



We will use x for a variable and X for its lock. Note that, as for all Data, a locking
variable X is stored at site(X). We might have that site(x)≠site(X) (e.g., a central site
is used for all locks). We might have that X is the lock of x only and site(x)=site(X)
(e.g., the fully distributed case). Or we could have two variables, which are at the
same or different sites, and have the same lock (e.g., primary copy locking). The
locks we will be dealing with are stored at a particular site, and are not global
variables stored at many sites.

The transactions and histories are partial orders of actions as in Definitions 2 
and 4, but we can have more types of actions. 
Definition 17 : An action is either an update of a variable (in Var) as defined in
Def. 3 or a lock X or unlock X step for some locking variable X (in LVar).

(a) The semantics of "lock X" are (X := if X=0 then 1 else error)

(b) The semantics of "unlock X" are (X := if X=1 then 0 else error)

We abbreviate "lock X" as Lx and "unlock X" as Ux, where X=lock-of(x). □
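
The two semantic rules translate directly into code. The following Python sketch is illustrative only: the dictionary `locks`, the exception `LockError` and the function names are hypothetical, and an absent entry is taken to mean X=0, in line with the integrity constraint of Definition 16.

```python
class LockError(Exception):
    """Raised when a lock or unlock step reaches the 'error' branch."""


def lock(locks, X):
    """lock X:  X := if X=0 then 1 else error."""
    if locks.get(X, 0) != 0:
        raise LockError(f"lock {X}: already locked")
    locks[X] = 1


def unlock(locks, X):
    """unlock X:  X := if X=1 then 0 else error."""
    if locks.get(X, 0) != 1:
        raise LockError(f"unlock {X}: not locked")
    locks[X] = 0
```

Since a second `lock` on the same X raises an error until an `unlock` intervenes, the locks modelled this way are exclusive.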



Note that we are dealing with exclusive locks. We will not discuss shared locks
(e.g., read or intention locks [13]).

Let T={T_1,T_2,...,T_m} denote an (ordinary) transaction system, that is without
"lock" or "unlock" steps.



Definition 18 : A locking policy L is a mapping, which given an (ordinary)
transaction system T transforms it into a locked transaction system L(T). The
locking policy transforms each T_i of T into L(T_i) (i=1,2,...,m), by inserting only
Lx,Ux steps and precedences between them subject to the following constraints:

(1) The only way to insert Lx or Ux steps is as a Lx-Ux pair with Lx before and
Ux after an update of x, in the partial order of L(T_i). Moreover for each x there is at
most one Lx-Ux pair in L(T_i).

(2) For every update of an x in T_i there is an Lx before and a Ux after it in the
partial order of L(T_i). □



Note that a locking policy could be nondeterministic (i.e. it could produce many
different L(T)'s for a given T).

In a locked transaction L(T_i) all actions at the same site are totally ordered, by
Def. 3 of transactions. As in the case without locks, a distributed locked transaction
represents a set of total orders of its actions (i.e., those that respect its partial order).
A new feature for the distributed case is: we can have actions p,q concurrent in T_i
and Lx's, Ux's inserted in T_i, with such precedences as to make p an ancestor of q in
L(T_i). In other words the locking policy can restrict the parallelism inherent in T_i.
Let h be a history (or a prefix of a history) of L(T). We say that h is legal
(i.e. preserves the IC of locks in Definition 16) if between any two occurrences of Lx
in h there is an occurrence of Ux. We denote this as h ∈ M(L(T)). Let L^{-1}(h) be the
induced subgraph of h if all lock and unlock steps were removed. The set of histories
O(L)=L^{-1}(M(L(T))) is called the output of the locking policy L and captures the
parallelism supported by L.
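
For a totally ordered history, the legality condition and the map L^{-1} can be sketched as follows. This is a simplification under stated assumptions: histories are partial orders in general, whereas here a history is a Python list of steps (transaction, op, variable) with op equal to "L", "U", or "W" for an update. The names `is_legal` and `strip_locks` are ours.

```python
def is_legal(history):
    """True iff between any two occurrences of Lx there is an occurrence of Ux."""
    held = set()  # variables whose most recent lock step has not been followed by Ux
    for _txn, op, x in history:
        if op == "L":
            if x in held:
                return False  # a second Lx with no Ux in between
            held.add(x)
        elif op == "U":
            held.discard(x)
    return True


def strip_locks(history):
    """L^{-1}(h) for a total order: drop every lock and unlock step."""
    return [step for step in history if step[1] not in ("L", "U")]
```

Applying `strip_locks` to every legal history of L(T) yields, in this simplified setting, the totally ordered members of O(L).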



Definition 19 : A locked transaction system L(T) is safe if every history in O(L)
is serializable. It is deadlock-free if for any legal prefix a of a history of L(T), there is
a suffix w such that a.w ∈ M(L(T)). □



It is easy to see that if L(T) is safe we can realize M(L(T)) using a scheduler,
which consists of a simple lock manager and a mechanism for avoiding or breaking
deadlocks. The deadlock problem becomes more acute in a distributed
environment, where it requires the use of messages [22,23].

As an example of a distributed locking policy consider two-phase locking (2PL). 

2PL : All lock steps in a distributed locked transaction must precede all unlock 
steps in the transaction's partial order. 

Every total order consistent with a 2PL distributed transaction is a 2PL 
centralized transaction. Thus we can infer, from the safety of centralized 2PL, its 
safety for the distributed case. Similar easy generalizations exist for the safe and 
deadlock-free tree-[30], digraph-[39] or hypergraph-[39] policies, which apply to the
structured Data case. 
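
The 2PL condition on a partial order can be checked mechanically. In the sketch below, which is our own illustration rather than part of the formal model, a locked transaction is given by a list of steps (op, variable) and a list of precedence edges; the function names are hypothetical. A transaction is 2PL when every lock step reaches every unlock step in the transitive closure of the precedences.

```python
def descendants(steps, edges):
    """Map each step to the set of steps that follow it in the partial order."""
    succ = {s: [] for s in steps}
    for a, b in edges:
        succ[a].append(b)
    reach = {}
    for s in steps:
        seen, stack = set(), list(succ[s])
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(succ[n])
        reach[s] = seen
    return reach


def is_two_phase(steps, edges):
    """True iff every lock step precedes every unlock step."""
    reach = descendants(steps, edges)
    locks = [s for s in steps if s[0] == "L"]
    unlocks = [s for s in steps if s[0] == "U"]
    return all(u in reach[l] for l in locks for u in unlocks)


# A transaction updating x at site 1 and y at site 2 ("W" marks an update).
STEPS = [("L", "x"), ("W", "x"), ("U", "x"), ("L", "y"), ("W", "y"), ("U", "y")]
SITE_ORDER = [(("L", "x"), ("W", "x")), (("W", "x"), ("U", "x")),
              (("L", "y"), ("W", "y")), (("W", "y"), ("U", "y"))]
CROSS_EDGES = [(("L", "x"), ("U", "y")), (("L", "y"), ("U", "x"))]
```

With only the per-site orders `SITE_ORDER` the transaction is not 2PL; adding the two precedences in `CROSS_EDGES` between the sites makes it 2PL, which shows how a locking policy can introduce cross-edges that were not in the original transaction.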

An example of a distributed 2PL transaction system is presented in Figure 5.1.
This example also shows that O(2PL) (i.e., the set of legal output histories without
the locks) is not a concurrency control principle as defined in Def. 11 Section 2.1. 
This is because the ordering of lock, unlock steps introduces cross-edges that were 
not part of the initial transactions T. 

Our main task now will be to generalize the results of [39] towards a 
characterization of safe systems. 
5.2 The Safety of Distributed Locked Transaction Systems

Let T_i (i = 1,2) denote a pair of locked distributed transactions and T_i^+ (i = 1,2) a
pair of totally ordered locked distributed transactions. The jth step of T_i^+ is T_ij^+,
1≤j≤m_i. As noted above T_i={T_i^+ | T_i^+ respects >_{T_i}} (i=1,2).

Consider a transaction system {T_i^+, i=1,2}. In the coordinate plane
(T_1^+,T_2^+) (see Fig. 5.2) take the two axes to correspond to T_1^+ and T_2^+, and the
integer points 1,2, etc. on these axes to correspond to the steps T_11^+, T_12^+, etc.
(respectively T_21^+, T_22^+, etc.) of the transactions. A point p may represent a
possible state of progress made toward the completion of T_1^+ and T_2^+. These
transactions will contain properly nested lock-unlock steps. Each variable x such that
both T_1^+ and T_2^+ contain a Lx-Ux pair, has the effect of creating a forbidden
region (a rectangle delimited by the grid lines corresponding to the Lx-Ux steps), the
points of which do not represent reachable states (see Fig. 5.2). Adding such
rectangles to the plane has some consequences. For example, the point u is now
unreachable, yet not in any rectangle; in contrast, point d is a state of deadlock.
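
The forbidden rectangles and states of deadlock such as d can be illustrated with a small sketch. The encoding below is our own assumption and not the construction of [39]: a state (i, j) records that i steps of T_1^+ and j steps of T_2^+ have been executed, a state is forbidden when both transactions would hold the same lock, and a deadlock is a reachable state, other than the final one, from which no step can be taken.

```python
def lock_intervals(t):
    """Map each locked variable to (position of Lx, position of Ux), 1-indexed."""
    pos = {}
    for k, (op, x) in enumerate(t, start=1):
        if op == "L":
            pos[x] = [k, None]
        elif op == "U":
            pos[x][1] = k
    return {x: tuple(p) for x, p in pos.items()}


def forbidden(i, j, iv1, iv2):
    """True iff in state (i, j) some lock is held by both transactions."""
    return any(iv1[x][0] <= i < iv1[x][1] and iv2[x][0] <= j < iv2[x][1]
               for x in iv1.keys() & iv2.keys())


def deadlock_states(t1, t2):
    """Reachable non-final states from which neither transaction can advance."""
    iv1, iv2 = lock_intervals(t1), lock_intervals(t2)
    m1, m2 = len(t1), len(t2)
    seen, stack, dead = {(0, 0)}, [(0, 0)], []
    while stack:
        i, j = stack.pop()
        moves = [(a, b) for a, b in ((i + 1, j), (i, j + 1))
                 if a <= m1 and b <= m2 and not forbidden(a, b, iv1, iv2)]
        if not moves and (i, j) != (m1, m2):
            dead.append((i, j))
        for s in moves:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return sorted(dead)
```

For T_1^+ = Lx Ly Ux Uy and T_2^+ = Ly Lx Uy Ux the only deadlock is the state (1, 1), where each transaction holds one lock and waits for the other; if both transactions lock x and y in the same order, no deadlock state exists.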

A history, that is totally ordered, has the following geometric image [39]. It is a
nondecreasing curve from the point (0,0) to the point (m_2+1, m_1+1), not passing
through any other grid point and not through any rectangle (e.g. h in Fig. 5.2). To
read the history off any such curve we simply enumerate the grid lines that it
intersects. Two totally ordered serial histories are represented by the curves h_1,h_2 in
Fig. 5.2.

Figure 5.2 The (T_1^+,T_2^+)-plane
From [39] we have the following characterization.

Proposition 2 : A history, which is totally ordered, for the transaction system
{T_i^+, i = 1,2} is not serializable iff the corresponding curve separates two
rectangles. □

No two rectangles touch at a grid point (by our definition of locked transaction
systems). In order to study the safety of {T_i^+, i = 1,2} the only actions we have to
consider are pairs of Lx-Ux steps, where both T_i^+'s update x. The following Lemma
for distributed locked transactions is a direct consequence of Proposition 2, because
every nonserializable history corresponds to some set of totally ordered
nonserializable histories.



Lemma 1 : A distributed locked transaction system {T_1,T_2} is safe iff for all
pairs T_1^+,T_2^+ there is no curve (corresponding to a history) that separates two
rectangles in the (T_1^+,T_2^+)-plane. □



An example of an unsafe system {T_1,T_2}, where only relevant Lx-Ux steps are
given, is provided by Fig. 5.3(a). In Fig. 5.3(b) we have a pair T_1^+,T_2^+ that
happens to be safe. In Fig. 5.3(c) we have a pair T_1^+,T_2^+ that illustrates why the
system is unsafe.

Since there is an exponential number of possible pairs T_1^+,T_2^+ an iterative
application of the test of Proposition 2 (which involves an O(n log n log log n)
computation of a "closure" for a geometric region of rectangles [21]) is no longer
efficient.

Our contribution will be an efficient combinatorial (as opposed to geometric)
test (i.e. sufficient condition) of safety for distributed locked transaction systems. Our
combinatorial test (Theorem 5) provides an alternate way of characterizing the
centralized problem. It is also a necessary condition of safety (Theorem 6) for
centralized transactions and transactions distributed at two sites. For more sites a
complete and efficient characterization is an open problem.

Figure 5.3: (a) Distributed locked transactions (x at site 1 and y,z at site 2); (b) safe {T_1^+,T_2^+}
Let us define:



DL(T_1,T_2) : Given two locked distributed transactions T_1,T_2 construct the
digraph DL(T_1,T_2)=(V,A) such that:

(a) V the vertex set, with vertex x iff both T_1 and T_2 contain a Lx-Ux pair.

(b) A the arc set, with arc (xy) iff (Ly >_{T_1} Ux and Lx >_{T_2} Uy).



An example of DL(T_1,T_2) is presented in Fig. 5.3(d). From the definition of
DL(T_1,T_2) we have that (xy) ∈ A iff the upper-left corner of the x-rectangle is in the
lower-right corner formed by the y-rectangle on all possible (T_1^+,T_2^+)-planes (see
Fig. 5.4). This implies that in every such plane no curve corresponding to a history
can pass below the y-rectangle and above the x-rectangle.



Figure 5.4: (xy) ∈ DL(T_1,T_2). Only three types of paths are at most feasible.
Theorem 5 : Let {T_1,T_2} be a locked transaction system. If DL(T_1,T_2) is
strongly connected, then {T_1,T_2} is safe.

Proof : Let T_1 and T_2 conflict at variables x_1,x_2,...,x_k. Then for
DL(T_1,T_2)=(V,A) we have V={x_1,x_2,...,x_k}.

In a (T_1^+,T_2^+)-plane we can associate every path s, that corresponds to a
possible output history of a lock manager, to a vector of k binary values
s=(b_1,b_2,...,b_k). These values are:

b_i=1 if s passes above the x_i-rectangle

b_i=0 if s passes below the x_i-rectangle

Therefore if (x_ix_j) ∈ A we can say that for all (T_1^+,T_2^+)-planes and paths s
b_i≤b_j (i.e. only b_i=1, b_j=1 or b_i=0, b_j=0 or b_i=0, b_j=1 are allowed).

Since DL(T_1,T_2) is strongly connected there is a directed path (x_i...x_j) and a
directed path (x_j...x_i) for 1≤i,j≤k, i≠j. Thus always b_i≤...≤b_j and b_j≤...≤b_i for all i,j.
This implies that the only allowable values for the vectors s are (0,0,...,0) and
(1,1,...,1). Thus for all (T_1^+,T_2^+)-planes there is no path corresponding to a history
separating two rectangles. Therefore {T_1,T_2} is safe. □



In order to characterize safety of a distributed system we need a succinct way of
describing the forbidden regions in all (T_1^+,T_2^+)-planes. We use this
characterization (as in the proof of Theorem 5) to produce a short proof, that all
paths, which correspond to output histories of a lock manager, must either pass
below or above all forbidden regions.

The simple condition of safety provided by Theorem 5 is a sufficient one. It is
necessary for centralized transactions (Lemma 2), where another obvious complete
characterization is the geometric pattern on the unique (T_1^+,T_2^+)-plane. It is also a
necessary characterization for transactions distributed between two sites (Theorem 6).
Recall that the safety question is in co-NP, whereas its negation is in NP, that is to
prove a system unsafe all we have to do is guess a nonserializable history in O(L) and
verify that fact in polynomial time.

We should point out that DL(T_1,T_2) ignores some of the precedences of T_1 and
T_2. This restricts the proof of necessity to two sites and indicates that a complete
characterization of forbidden regions for an arbitrary number of sites could be a hard
problem.

If DL(T_1,T_2)=(V,A) is not strongly connected then it has more than one
strongly connected component. Among these there is a strongly connected
component with no incoming edges from other strongly connected components. We
call such a component a dominator X, where X ⊆ V denotes its set of nodes. In fact
the only property of the dominator we will use is that there are no incoming edges in
X from nodes in V\X (and not its strong connectivity).

We will prove necessity of the condition in Theorem 5 using the following
intuitive construction. Given T_1,T_2, DL(T_1,T_2)=(V,A) not strongly connected and a
dominator X, we will construct two special total orders T_1^+,T_2^+. In T_1^+ the
actions (Lx-Ux, x ∈ X) will be executed as late as possible after the actions (Lz-Uz,
z ∉ X). In T_2^+ we do the opposite; this tends to isolate the forbidden region
corresponding to X in the upper left corner. Each time we will argue that this region
and all other rectangles can be separated as in Fig. 5.5, by a curve which will
obviously correspond to a possible output history. Therefore we will prove something
stronger than lack of safety, namely: "If X is such that there are no incoming edges in
X, then we can separate all x-rectangles from all z-rectangles, x ∈ X, z ∈ V\X".



Lemma 2 : Given a locked transaction system {T_1,T_2}, where T_1,T_2 are totally
ordered, if DL(T_1,T_2) is not strongly connected then {T_1,T_2} is unsafe.

Proof : Obviously there is only one (T_1^+,T_2^+)-plane. Pick a dominator X in
DL(T_1,T_2). By Theorem 5 all its rectangles form a region that is above an increasing
curve, whose corners correspond to lower right corners of x_i-rectangles, x_i ∈ X
(see Fig. 5.6). Let z ∉ X, then the z-rectangle must be below that curve. If it is not
there is an x_i ∈ X such that Lz >_{T_2} Ux_i and Lx_i >_{T_1} Uz (since T_1, T_2 are totally
ordered) implying that (zx_i) ∈ DL(T_1,T_2), a contradiction. □
Theorem 6 : Let T={T_1,T_2} be a locked transaction system, where T_1,T_2 are
distributed at two sites. If DL(T_1,T_2) is not strongly connected then T is unsafe.

Proof : For this type of distributed transactions there could be an exponential
number of possible (T_1^+,T_2^+)-planes. Let X be a dominator of DL(T_1,T_2). We use
X to construct two special total orders T_1^+,T_2^+ that will help us separate all
x-rectangles (x ∈ X), from all z-rectangles (z ∉ X) and, since X and V\X are nonempty,
this will provide us with a certificate of unsafeness. We will use the shorter notation
>_i instead of >_{T_i}, and ≥_i for "precedes or can be concurrent to" in transaction T_i.

Let z, x, y be such (if they exist) that:

(1) z ∉ X and x,y ∈ X.

(2) Lz >_2 Ux and Ly >_1 Uz.

Then we can infer:

(3) x≠y and Uy ≥_2 Ux and Ly ≥_1 Lx.

Since X is a dominator of DL(T_1,T_2) it cannot contain either of the directed edges
(zx) or (zy). We can infer (3) because, if x=y then (zx) ∈ DL(T_1,T_2), or if (Ux >_2 Uy) then
(Lz >_2 Uy) and (zy) ∈ DL(T_1,T_2), or finally if (Lx >_1 Ly) then (Lx >_1 Uz) and (zx) ∈
DL(T_1,T_2).

For any z, x, y satisfying (1),(2) and (3) we can construct the following partial
orders:

T_1' is T_1 with the added precedence Ly >_1' Lx

T_2' is T_2 with the added precedence Uy >_2' Ux

Obviously T_i' (i=1,2) are partial orders. Also T_i' is T_i (i=1,2) with at most one
precedence added (i.e., if the additional precedence were already in T_i then T_i'=T_i).
Therefore if {T_1',T_2'} is unsafe so is {T_1,T_2}.

Based on the existence of only two sites we will prove the following important
fact about the new system T'={T_1',T_2'}:

(I) X is a dominator of DL(T_1',T_2')

Since x, y, z are distinct variables we have three cases: case (a) x,y stored at the
same site, case (b) x,y stored at different sites and z stored at the same site as x,
case (c) x,y stored at different sites and z stored at the same site as y.



Case (a) : If x,y are stored at the same site we must have (Ly >_1 Lx) and
(Uy >_2 Ux) (these actions cannot be concurrent in T_1 or T_2). Therefore T_i'=T_i
(i=1,2) and (I) follows trivially.

Case (b) : We have that x and z are stored at the same site and (Lz >_2 Ux) (the
possible positions of Lz are illustrated in Fig. 5.7). Since (zx) ∉ DL(T_1,T_2) we must
have (Uz >_1 Lx) (i.e. these actions cannot be concurrent in T_1, because x and z are at
the same site). Since (Ly >_1 Uz >_1 Lx), we have that already (Ly >_1 Lx) and therefore
T_1'=T_1. We only add precedence (Uy >_2' Ux) to T_2 to obtain T_2'.

The only way for new edges to be generated in DL(T_1',T_2') from a z' ∉ X into an
x' ∈ X, is for (Lz' >_2 Uy) and (Ux ≥_2 Ux') (x' could be x). Moreover z' and x' should
be stored at different sites (otherwise Lz',Ux' would have been ordered already in
T_2) and in T_1=T_1' we must have (Lx' >_1 Uz').

If z' and x were stored at the same site, x' must be stored at the site of y. Thus in
T_2 we must have had (Lz' >_2 Uy and Uy >_2 Ux') (otherwise the new edge would
have introduced a cycle in T_2). Therefore Lz' and Ux' were already ordered in T_2, a
contradiction.

If z' and y were stored at the same site, x' must be stored at the other site and
Fig. 5.7 illustrates the possible positions of Lz' and Ux' in T_2. From these ranges of
Lz' and Ux' in T_2, we can derive the possible positions of Uz' and Lx' in T_1. Since
DL(T_1,T_2) cannot contain either (z'y) or (zx') and since (Lz' >_2 Uy and Lz >_2 Ux'),
we must have (Uz' >_1 Ly and Uz >_1 Lx'). It easily follows from the established ranges
that T_1 contains a cycle (Uz Lx' Uz' Ly Uz), a contradiction.

This proves (I) for this case.

Case (c) : This case is symmetric with case (b). The argument that proves (I) is
similar to the one above. The ranges of Lz', Uy' in T_2 and Uz', Ly' in T_1 are
illustrated in Fig. 5.8. This time the additional precedence is (Ly >_1' Lx), and z' ∉ X,
y' ∈ X, and z' must be stored at the site of x, and y' at the site of y.

This completes the proof of (I).

Starting from T we can construct a sequence of transaction systems T,T',...,T*
(of length polynomial in |T|) such that in T*:
(i) X is a dominator of DL(T_1*,T_2*)
(ii) If (z ∉ X), (x,y ∈ X), (Lz >_2* Ux), (Ly >_1* Uz) then (Uy >_2* Ux), (Ly >_1* Lx).
Now all we have to do is produce the total orders T_1^+, T_2^+ from topologically
sorting T_1*, T_2*. We use two tricks. First, we place the Ux (i.e. x in X) steps as early
as possible in T_2^+. Second, we place the Lx (i.e. x in X) steps as late as possible in
T_1^+; moreover if Ux is before Ux' in T_1^+ we put Lx before Lx' in T_1^+ (if
possible).

It is easy to see that a nondecreasing curve lower-bounding the area of the
rectangles in X is created. Also if (Ly >_1+ Uz) for some z ∉ X, and Ly forms part of
this curve and is closest to Uz (see Fig. 5.9) then we can easily prove that
(Ly >_1* Uz). (From the way T_1^+ was constructed, if there is a closer (Ly_c >_1* Uz)
we must have (Ly >_1* Ly_c), else Ly_c would have been scheduled before Ly in T_1^+).
From the properties of T* we know that for all x ∈ X such that (Lz >_2* Ux) we have
(Uy >_2* Ux). By the way T_2^+ was constructed (Uy as early as possible) we can infer
(Uy >_2+ Lz).

Therefore z-rectangles are below or to the left of all x-rectangles in the
(T_1^+,T_2^+)-plane. This completes the proof of Theorem 6. □
The condition of Theorem 6 cannot be applied to systems {T_1,T_2} distributed at
more than two sites. An example demonstrating that fact is illustrated in Fig. 5.10,
where although we have a dominator X={x_1,x_2} (Fig. 5.10(a)) we cannot separate it
from the other rectangles (Fig. 5.10(b) and (c)).
Figure 5.10 (panel captions): (b) T is not a transaction system; (c) DL(T) has dominator {x_1,x_2}
Thus we can test safety of distributed transaction systems T={T_1,T_2}, on two
sites in O(n^2) time [1]. In fact the proof of Theorem 6 gives us the following
nondeterministic polynomial time algorithm to decide if an arbitrary system T is
unsafe.



Algorithm UNSAFE : Given T={T_1,T_2} a locked transaction system.

(1) Guess a (nonempty) set of rectangles X that are above a curve, which
corresponds to a nonserializable history. Let Z be the (nonempty) set of the rest of
the rectangles.

(2) Start with T_1*=T_1, T_2*=T_2 and keep augmenting them by the following
rule:

If z ∈ Z, x,y ∈ X, (Lz >_2* Ux), (Ly >_1* Uz) then add (Uy >_2* Ux), (Ly >_1* Lx).

(3) Check if T_1*, T_2* are partial orders and if DL(T_1*,T_2*) has no edges (zx)
for z ∈ Z, x ∈ X.

(4) If (3) is true say yes.



The nondeterministic choice at step (1) indicates that the decision problem
"Given T={T_1,T_2} is it safe?" may be co-NP-Complete. Such a result would be
interesting since it would illustrate the effect of multiple sites on the complexity of 
the problem. 

Until now we have discussed transaction systems T with two transactions. The 
question of safety of a system with an arbitrary number of centralized transactions is 
co-NP-Complete [39], because of a combinatorial condition introduced by the 
conflict graph G(T). Since the question of safety of a system of an arbitrary number
of distributed transactions is in co-NP, we cannot hope to indicate a difference 
between centralized and distributed by further pursuing this problem. 

Another interesting issue is that of deadlock freedom. For the centralized case 
the geometric approach used for safety [39] gives us a test of deadlock freedom at no 
extra cost. The approach using DL(T_1,T_2) does not have this nice property.
Therefore we have determined three interesting open problems:

(a) Given a system {T_1,T_2} of arbitrary locked distributed transactions, is it
safe?

(b) Can the polynomial time bounds implied by Theorems 5 and 6 be improved
using the special structure of DL(T_1,T_2)?

(c) Given a system {T_1,T_2} of locked distributed transactions, is it deadlock-free
(even if two sites are used and the system is safe)?

6. Conclusions and Open Problems 

We have examined the complexity of distributed database concurrency control. 
We have provided a rigorous mathematical framework for the study of on-line 
distributed problems (Chapter 2), established a connection between distributed 
computation and combinatorial games (Chapter 3) and finally derived both negative 
(Chapter 4) and positive (Chapter 5) complexity results. 

Our main result (Theorem 4) shows that concurrency control, an on-line problem 
clearly in NP in the centralized case, is PSPACE-Complete in the distributed case. 
This result is quite strong, in that it holds for transaction systems of rather ordinary 
appearance (e.g., transactions consisting of sequences of six updates each). Also, the 
negative implications of our result (Corollary 4.4) are quite robust. For example, 
even if the scheduler is equipped with a powerful oracle belonging anywhere in the 
polynomial hierarchy, it still cannot minimize communication efficiently, unless the 
polynomial hierarchy collapses. 

In the process of proving this negative result, we have related distributed 
concurrency control to certain combinatorial games played on graphs. It could be 
that this connection is of some practical value, since the length of these games 
corresponds to counting messages. There is a more-or-less immediate heuristic for 
approximating an optimal strategy in the game CONFLICT. This heuristic is based 
on the following purely combinatorial problem, which is still open: 

(I) "Given an undirected graph with its edges colored red and 
green, find the smallest set of edges that have to be deleted in 
order for the resulting graph to have no two-color cycle." 
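
Problem (I) can be made concrete with a brute-force sketch for very small graphs. It assumes that a "two-color cycle" means a simple cycle containing at least one red and at least one green edge, which is one reading of the statement and not a definition given here. The function names are hypothetical, and the search is exponential in the number of edges, so it only illustrates the problem and says nothing about its complexity.

```python
from itertools import combinations

def has_two_color_cycle(edges):
    """edges: list of (u, v, color) for an undirected graph.
    Return True if some simple cycle uses edges of two different colors
    (assumed meaning of 'two-color cycle')."""
    adj = {}
    for i, (u, v, c) in enumerate(edges):
        adj.setdefault(u, []).append((v, c, i))
        adj.setdefault(v, []).append((u, c, i))

    def dfs(start, node, visited, used, colors):
        # Extend a simple path from `start`; `used` holds the edge indices on it.
        for nxt, c, i in adj[node]:
            if i in used:
                continue
            if nxt == start and used:
                # The path closes into a simple cycle; inspect its colors.
                if len(colors | {c}) == 2:
                    return True
            elif nxt not in visited:
                if dfs(start, nxt, visited | {nxt}, used | {i}, colors | {c}):
                    return True
        return False

    return any(dfs(s, s, {s}, frozenset(), frozenset()) for s in adj)

def min_deletion_set(edges):
    """Try deletion sets in order of increasing size and return the first one
    whose removal leaves no two-color cycle (exhaustive search)."""
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if not has_two_color_cycle(kept):
                return [edges[i] for i in removed]

# A red square a-b-c-d with one green chord a-c.
square = [("a", "b", "red"), ("b", "c", "red"), ("c", "d", "red"),
          ("d", "a", "red"), ("a", "c", "green")]
print(min_deletion_set(square))
```

In this example every two-color cycle passes through the green chord, so deleting that single edge suffices, while the all-red square that remains is a one-color cycle and is allowed.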

Other open problems from Chapter 4 are related to technical issues (II) and (III) or 
to the messages vs. computation steps argument of Corollary 4.4 (IV) and (V). This last 
argument seems quite general in the context of distributed computation. 

(II) Given T without cross-edges and b > 0, is the minimax length 
of PREFIX(<T,0>) greater than b? (conjectured to be PSPACE- 
Complete) 

(III) Given T, is the minimax length of PREFIX(<T,0>) greater 
than 0? (conjectured to be in P) 

(IV) What are the consequences of NP = PSPACE on the 
existence of efficient schedulers? 

(V) Can a contradiction similar to Corollary 4.4 be derived if 
[<T,0> ∈ M_c(b)?] is NP-Complete? 

In Chapter 5 a new O(n^2) safety test was derived for two-transaction locked 
systems {T1, T2}. This is a necessary and sufficient condition, if transactions are 
distributed at two sites, and sufficient otherwise. There are a number of interesting 
open problems. 

(VI) Given {T1, T2} distributed at an arbitrary number of sites, 
are they safe? (conjectured to be co-NP-Complete) 

This would demonstrate the complexity introduced by the number of sites. 

(VII) Given {T1, T2} distributed at two sites and safe, are they 
deadlock-free? 

Issues of local and global deadlocks and message-efficient deadlock managers 
recall the analysis of Chapters 3 and 4. 

(VIII) Can the polynomial bounds of O(n^2) (n is the number of 
nodes of the digraph DL) implied by Theorems 5 and 6 be 
improved using the special structure of DL? 

This is possible in the centralized case, where the bound is O(n log n log log n). 

Finally our analysis of distributed locking can serve as the basis for the 
development of novel distributed locking strategies, which are not simply 
generalizations of centralized rules. 